Terms clarification

Posts: 90
Registered: Aug 29, 2008

Hi all,

When I get top_terms, I receive frequencies associated with those terms.

When I get an artist's terms, I receive frequencies and weights associated with each term for that artist.

Why do I receive a different frequency for a term in the overall context compared to the artist-specific context?

Posts: 1113
Registered: Sep 08, 2008

Term frequencies are normalized, so when you get an artist's term frequencies, you are getting the frequencies normalized for that artist. If 'britpop' is the most frequent term for the artist, it will get a 1.0.

When getting the top_terms, these are normalized against the whole set of terms. Since 'rock' is the most frequently occurring term, its frequency will be 1.0.
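
To make the difference concrete, here is a minimal sketch of that kind of normalization. It assumes the scaling is a simple division by the largest count in each set, which the thread does not confirm; the function name `normalize` and all of the raw counts are hypothetical, invented for illustration only.

```python
def normalize(counts):
    """Scale raw counts so the most frequent term gets 1.0.

    Assumption: normalization divides by the largest count in the set.
    Counts are expected to be positive.
    """
    if not counts:
        return {}
    peak = max(counts.values())
    return {term: n / peak for term, n in counts.items()}


# Hypothetical raw counts, not real data from the service.
artist_counts = {"britpop": 40, "rock": 30, "indie": 10}
corpus_counts = {"rock": 5000, "indie": 1000, "britpop": 250}

artist_freq = normalize(artist_counts)   # normalized within one artist
corpus_freq = normalize(corpus_counts)   # normalized across all terms

for term in artist_counts:
    print(term, artist_freq[term], corpus_freq[term])
```

Under these made-up numbers, 'britpop' peaks within the artist's set while 'rock' peaks in the overall set, so the same term carries a different frequency in each context.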

Hope this helps.

Paul

Posts: 90
Registered: Aug 29, 2008

Paul,

Thank you, that's very clear.

Is there any way of getting a corpus frequency for a term that's outside of the top 1000? For example, I see that Sufjan Stevens gets a 'folk' term, but that term is outside the top 1000, so there's no overall sense of frequency...

Posts: 1113
Registered: Sep 08, 2008

atl, sorry, we currently only expose the top 1000 terms. However, it is surprising that 'folk' isn't in the top 1000. I think 'folk' may be getting caught in one of the filters that we use to make sure that only musically relevant words are exposed as terms. Let me check on that.

Paul

Posts: 90
Registered: Aug 29, 2008

okay, thanks. I'm fairly sure I can find a way around it.

Yes, 'folk' stood out on a spot check, especially considering 'folk metal', 'folk music', 'folk punk', 'folk rock', 'folk-pop', and 'folktronica' were present in the top 1000. :)

Posts: 90
Registered: Aug 29, 2008

Also, for reference, and to help the knowledge along, "pop" is also missing from the top 1000.
