unparsable value returned in Analyze

Posts: 90
Registered: Aug 29, 2008

There's a parsing error in Remix because the underlying Analyze method is returning an illegal value (one that Python's float() cannot parse).

For the example file "EverythingIsOnTheOne.mp3", get_segments returned timbre coefficient #11 (of 12) as "-". It occurs about 80% of the way through the returned analysis (in the segment starting at 174.61737s).

Here's the call: get_segments

Or, from within Remix:
from echonest import *
a = web.analyze.get_segments("f242f99220869b4cb455ea7735b67452")
audio.fullSegmentsParser(a)

Although Remix could be patched to parse this, I'd strongly call it a bug in Analyze instead.
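
For anyone who wants to locate the bad values in a response before handing it to Remix, here is a minimal sketch. It walks an XML document with minidom (the same parser that shows up in the tracebacks in this thread) and lists every element whose text is not a valid float. The helper name find_bad_floats, the SAMPLE fragment, and the "coeff" tag are assumptions made up for illustration; the real get_segments schema may differ.

```python
from xml.dom import minidom


def find_bad_floats(xml_text, tag_name):
    """Return (index, raw_text) for each <tag_name> element whose text is not a float.

    Assumes the matched elements are leaves that contain only text.
    """
    doc = minidom.parseString(xml_text)
    bad = []
    for index, node in enumerate(doc.getElementsByTagName(tag_name)):
        # An empty element has no firstChild, so treat it as an empty string.
        raw = node.firstChild.data.strip() if node.firstChild else ""
        try:
            float(raw)
        except ValueError:
            bad.append((index, raw))
    return bad


# Hypothetical fragment, NOT the real Analyze schema.
SAMPLE = """<segment start="174.61737">
  <timbre>
    <coeff>12.5</coeff>
    <coeff>-3.25</coeff>
    <coeff>-</coeff>
  </timbre>
</segment>"""

print(find_bad_floats(SAMPLE, "coeff"))
```

This only reports the problem; it does not change how Remix parses anything.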

Posts: 713
Registered: Sep 08, 2008

atl:

Agreed, this is a problem in Analyze. We are taking a look at it.

Paul

Posts: 90
Registered: Aug 29, 2008

Thanks once again, Paul.

Posts: 90
Registered: Aug 29, 2008

FWIW, it appeared again in a new upload: md5=97723d47271f75704fa5b8e9ea1bf1ca.

Posts: 713
Registered: Sep 08, 2008

atl - thanks, we are on it. - P

Posts: 90
Registered: Aug 29, 2008

Got a TRISTAN-0 computation error now.
Error 8: Computation error (TRISTAN-0)

md5=8738a2bdbc99b4ea084e945dd8bca09b

Posts: 713
Registered: Sep 08, 2008

atl - got it. thanks. Keep 'em coming. - P

Posts: 90
Registered: Aug 29, 2008

I'll keep throwing a bunch of things in this thread. Here's a confirmed 500-type error where HTML is returned (and makes Remix barf):

http://developer.echonest.com/api/get_time_signature?api_key=XXXX&md5=389c7ee1eb92a323b7f717780be07bc6
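
A client can at least fail with a clear message when this happens, instead of choking somewhere deep in the XML parser. The sketch below is one way to do that, assuming the response body is already in hand as a string. The names parse_response and AnalyzeResponseError are hypothetical and not part of Remix or the Analyze API.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


class AnalyzeResponseError(Exception):
    """Raised when a response body is not the well-formed XML we expected."""


def parse_response(body):
    """Parse a response body, failing clearly on HTML error pages."""
    # Cheap check first: an HTML error page usually announces itself up front.
    head = body.lstrip()[:200].lower()
    if head.startswith("<!doctype html") or head.startswith("<html"):
        raise AnalyzeResponseError("got an HTML page, probably a server error")
    try:
        return minidom.parseString(body)
    except ExpatError as exc:
        # Anything else that is not well-formed XML ends up here.
        raise AnalyzeResponseError("response is not well-formed XML: %s" % exc)
```

Checking the HTTP status code before parsing would be a reasonable first step too, where the client exposes it.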

I've also gotten lots more of that "-" value returned in segments, maybe one in four newly uploaded songs?
md5=dbc511140f349ed789690361883f6b25
md5=836077983aa5aba8a7a818d54cad93bf
md5=38d11c099a027b022c3022d76b879385

It seems to always happen in the timbre vector, so at ~1000 segments/song and 12 timbre coefficients per segment, with one in four new songs affected, it's a value that goes bad about one time in 50K? Is that the right order of magnitude?

Posts: 90
Registered: Aug 29, 2008

(or, hmm, about one in 65536 times?)
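
The back-of-the-envelope estimate can be spelled out as below. The inputs are the rough figures from the post (about 1000 segments per song, 12 timbre coefficients per segment, one in four new songs affected). The independence assumption in the second estimate is mine, not something reported in the thread.

```python
SEGMENTS_PER_SONG = 1000   # rough figure from the post
TIMBRE_COEFFS = 12         # timbre coefficients per segment
AFFECTED_SONG_RATE = 0.25  # about one in four new uploads

values_per_song = SEGMENTS_PER_SONG * TIMBRE_COEFFS

# Naive estimate: exactly one bad value in each affected song.
naive_odds = values_per_song / AFFECTED_SONG_RATE

# Alternative (an assumption): every value goes bad independently with
# probability p, so P(song affected) = 1 - (1 - p) ** values_per_song.
# Solving for p gives:
p = 1 - (1 - AFFECTED_SONG_RATE) ** (1.0 / values_per_song)

print("naive: about one in %d" % round(naive_odds))
print("independent: about one in %d" % round(1 / p))
```

The naive figure is one in 48,000 and the independent-failure figure is a little under one in 42,000, so both sit in the same order of magnitude as the 65536 guess.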

Posts: 713
Registered: Sep 08, 2008

Atl:

Thanks for the continued reports. Here's a status update. We have a new version of the analyze software coming that should address these problems. It may be released as early as the end of next week. We will let you know.

Paul

Posts: 4
Registered: Mar 14, 2009

For what it's worth, I just got the float parse error:

Computed MD5 of file is bf050111a040c1cacf736e5ec42f43a0
Probing for existing analysis
Analysis found. No upload needed.
Traceback (most recent call last):
  File "quanta.py", line 61, in <module>
    main(input_filename)
  File "quanta.py", line 25, in main
    chunks = audio_file.analysis.segments
  File "/usr/lib/python2.5/site-packages/echonest/audio.py", line 134, in __getattribute__
    value = parseFunction(value)
  File "/usr/lib/python2.5/site-packages/echonest/audio.py", line 655, in fullSegmentsParser
    loudness_begin = float(l.firstChild.data)
ValueError: invalid literal for float(): -

Posts: 713
Registered: Sep 08, 2008

visualmotive - thanks for the report. Paul

Posts: 90
Registered: Aug 29, 2008

Say, was this issue ever resolved? I did a spot check on some of the MD5s that I posted here earlier, and that "-" text node still shows up in the get_segments results. If it was resolved in a software update, is there any procedure for recomputing the bad analyses? If not, those MP3s sitting on my hard drive are effectively locked out.

(And if it isn't solved, can I suggest you knock this up in priority, so that newcomers aren't tripped up by it anymore?)

Posts: 713
Registered: Sep 08, 2008

atl:

I know that some issues have been addressed, but I'm not sure if the issues for these particular tracks have been resolved. I'll check the status during the morning scrum. For the Java client, I've added an option to the trackAPI class to allow parsing errors to be ignored, so when these aberrations occur, they can be skipped and progress can be made. Perhaps a similar option should be added to the Python Remix library.
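
As a rough idea of what such an option could look like on the Python side, here is a minimal sketch. It is not the Java client's implementation and not code from Remix; the function name parse_floats and its ignore_errors and placeholder parameters are hypothetical.

```python
def parse_floats(texts, ignore_errors=False, placeholder=None):
    """Convert strings to floats, optionally tolerating unparsable entries.

    With ignore_errors=False (the default) the first bad value raises
    ValueError, which is the strict behavior seen in this thread.
    """
    values = []
    for text in texts:
        try:
            values.append(float(text))
        except ValueError:
            if not ignore_errors:
                raise
            # Keep the slot so later coefficients stay at the same index.
            values.append(placeholder)
    return values


print(parse_floats(["1.5", "-", "2"], ignore_errors=True))
```

Keeping a placeholder rather than dropping the entry means a 12-value timbre vector stays 12 values long, at the cost of callers having to check for the placeholder.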

Paul

Posts: 90
Registered: Aug 29, 2008

Thanks, Paul.

In the past, I've considered taking that route (actually, pre-remix, I took that route IIRC) of routing around parsing errors like this. It would be trivial to code up. However, I've always ended up arguing against it because it doesn't help anyone else using Analyze with a non-Java, non-Python, or a homegrown toolchain. This is a bug in the API that needs to be fixed.

Posts: 713
Registered: Sep 08, 2008

atl:

I absolutely agree, these bugs need to be fixed. I'll give you an update on the plan in an hour.

Paul

Posts: 2
Registered: Jun 26, 2009

Hi everyone. I just installed the remix API on Windows XP with Python 2.5. The ffmpeg binary is in my python directory and is properly in my PATH. I attempted to run the example code from one.py but I received the following error. Is this the same problem with the return value of the timbre vector from the server or is this a problem with my install? Thanks!

Computed MD5 of file is 4d03a5402efcc6d67fa0426bd968e610
Probing for existing analysis
Analysis not found. Uploading...
Traceback (most recent call last):
  File "one.py", line 38, in <module>
    main(input_filename, output_filename)
  File "one.py", line 23, in main
    bars = audiofile.analysis.bars
  File "C:\Python25\Lib\site-packages\echonest\audio.py", line 148, in __getattribute__
    value = parseFunction(value)
  File "C:\Python25\Lib\site-packages\echonest\audio.py", line 952, in barsParser
    return dataParser('bar', doc)
  File "C:\Python25\Lib\site-packages\echonest\audio.py", line 902, in dataParser
    confidence=float(n.getAttributeNode('confidence').value)))
ValueError: invalid literal for float(): NaN

Posts: 113
Registered: Sep 05, 2008

hi pushups, sorry about this. There are two problems: (1) the Analyze API is still returning NaNs and Infs in certain (relatively rare) situations and (2) the Python code that parses Analyze responses is not handling them correctly. Within a few days Remix is getting updated to use pyechonest, which at least does not die when parsing NaNs. If you know your Python you can probably patch audio.py to not break when it gets a NaN; otherwise, hold still for a few days. Also, other files should work fine.
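
For those patching locally, one possible shape for such a fix is a drop-in replacement for the float() calls that handles the NaN and Inf spellings itself. This is only a sketch: tolerant_float is a hypothetical name, not a function in audio.py or pyechonest. My understanding is that before Python 2.6, float() left these spellings to the platform C library, which would explain why the error shows up on the Windows XP / Python 2.5 install above, but treat that as an assumption.

```python
# Build the special values arithmetically, because float("nan") is the very
# call that fails on the affected setups.
INF = 1e308 * 10   # overflows to positive infinity
NAN = INF - INF    # infinity minus infinity is not-a-number

_SPECIAL = {
    "nan": NAN,
    "inf": INF, "+inf": INF, "-inf": -INF,
    "infinity": INF, "+infinity": INF, "-infinity": -INF,
}


def tolerant_float(text):
    """Like float(), but maps NaN/Inf spellings without relying on the platform."""
    key = text.strip().lower()
    if key in _SPECIAL:
        return _SPECIAL[key]
    # Anything else, including a bare "-", still raises ValueError.
    return float(text)


confidence = tolerant_float("NaN")
print(confidence != confidence)  # NaN is the only float not equal to itself
```

Downstream code still has to decide what a NaN confidence means, so this moves the problem rather than solving it.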

Posts: 90
Registered: Aug 29, 2008

FWIW, this is indeed different from (and even more rare than) the timbre vector having '-' as a return value. I've seen it before, but a long time ago.

Posts: 2
Registered: Jun 26, 2009

Thanks for the info. I'll be on the lookout for the patch.
