IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.7

Should this be the default imdb scraper?

  • Yes, I do not want to re-import

    Votes: 19 95.0%
  • No, keep this one separate

    Votes: 0 0.0%
  • Who cares, I got movies to watch

    Votes: 1 5.0%

  • Total voters
    20
  • Poll closed.

damaster

Portal Pro
November 23, 2007
412
35
Home Country
Canada Canada
Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

Note: This scraper is meant to replace the default imdb.com scraper, so be sure to place it at a higher priority in your list. This will only work for new movies you import. Any existing movie will remain linked to the original scraper used during import of that movie, unless you send them back to the importer. Another solution would be to modify the scraper ID into <id>874902</id>, so that this new scraper becomes the new imdb.com scraper for your existing movies.

Would love to try this scraper but the note above really throws me off. I have to re-scan all of my existing imported movies? That means I'll lose any custom changes, which would really suck.

Is there a better way to integrate this scraper into existing movies? A DB hack perhaps? :)
 

pirivan

Portal Pro
January 19, 2008
62
2
Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

Note: This scraper is meant to replace the default imdb.com scraper, so be sure to place it at a higher priority in your list. This will only work for new movies you import. Any existing movie will remain linked to the original scraper used during import of that movie, unless you send them back to the importer. Another solution would be to modify the scraper ID into <id>874902</id>, so that this new scraper becomes the new imdb.com scraper for your existing movies.

Would love to try this scraper but the note above really throws me off. I have to re-scan all of my existing imported movies? That means I'll lose any custom changes, which would really suck.

Is there a better way to integrate this scraper into existing movies? A DB hack perhaps? :)

Actually, if I understand correctly, you don't have to re-scan all of your existing imported movies IF you modify the scraper ID so that his scraper becomes the new imdb.com scraper for existing movies... Or at least that is how I understand it. However, I didn't like the sound of doing that, so I did do a full re-import. The pain in the ass was again fixing all the titles that didn't scan/match properly (and I am sure there are a number of wrong matches that I missed and will only notice as I use MediaPortal) and then re-setting all the 'watched' flags. I can see that this would be much more of a challenge for someone who had a lot of customizations beyond simply watched or not!
 
DMember 49125

Guest
Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

Looks great, Rochess, but I don't get the English movie titles.

I imported the new movie LET ME IN and got a French title: Laisse-moi entrer

Also on MONSTERS VS ALIENS I got: Monsters vs Aliens: A Monstrous IMAX 3D Experience.
It should be without the IMAX 3D Experience part.

Hopefully you can fix this.

I got French & Russian (!) titles too.
 

RoChess

Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #14
    Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

    Hi Rochess,
    How can I use danish ratings instead of US or UK - would that be possible?

    There are a few dedicated Danish scrapers based on other website sources that are built and supported by Danish users who know the language. The reason is that imdb.com is very inconsistent about which ratings it supports. The English-speaking markets are well covered, but it gets real iffy on the non-English ones. It would probably be easier for the Danish scraper creator to build an English version of their scraper with Danish ratings than for me to do it the other way around.

    If, however, the Danish rating is always a direct conversion of the US or UK rating, then I can add a conversion. For example, if a US movie rated PG always means 'A' for Denmark, and you would be fine with that conversion, then it will indeed be easy for me to add it. But prudish American ratings don't always match the rest of the world. So let me know.

    Looks great, Rochess, but I don't get the English movie titles.
    I imported the new movie LET ME IN and got a French title: Laisse-moi entrer
    Also on MONSTERS VS ALIENS I got: Monsters vs Aliens: A Monstrous IMAX 3D Experience.
    It should be without the IMAX 3D Experience part.
    Hopefully you can fix this.

    For both movies, can you please go to the IMDb page and paste the HTML source that you get onto the paste2.org website? For example, I get Paste2: Next Generation Pastebin - Viewing Paste 1268833 when I do that from the USA, but you are getting a different result. The key in this case is line #49 of the source, where you will see "US". The scraper looks for this code and checks whether it is US, UK, CA, AU or NZ. As these are all English-speaking countries, it will then use the title as-is (as it will be the correct English title). If not, it will jump to the "Also known as: ..." title, which then becomes the "Laisse-moi entrer" title on an already English title page.

    Unfortunately I am limited to what imdb.com gives me back, so you will have to help me a little to fix this problem :)
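The country-code check described above can be sketched roughly like this. It is a Python illustration only, not the scraper's actual XML logic; the function and argument names are made up for the example, and it assumes the country code and both titles have already been pulled out of the page.

```python
# Country codes the post lists as English-speaking.
ENGLISH_COUNTRY_CODES = {"US", "UK", "CA", "AU", "NZ"}


def pick_title(country_code, main_title, aka_title):
    """Use the main title for a listed country code, otherwise the AKA title."""
    if country_code.upper() in ENGLISH_COUNTRY_CODES:
        return main_title
    return aka_title
```

With this sketch, any code that is not in the list falls through to the "Also known as" title, even when the main title on the page is already the English one.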

    Would love to try this scraper but the note above really throws me off. I have to re-scan all of my existing imported movies? That means I'll lose any custom changes, which would really suck.

    I'll try to come up with a better explanation then. In short: open the scraper XML file in Notepad, edit the <id>...</id> part to match the official imdb.com scraper, import the scraper, and you will not have to re-import any existing movie. As far as MovingPictures knows, this scraper will then become the new imdb.com scraper. However, I had to keep this one separate, because it does a lot more than the default scraper and not everybody might want that (hence the poll). For example, the default RottenTomatoes score is a big one.
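For anyone who would rather script the ID edit than do it by hand in Notepad, here is a minimal Python sketch. The helper name is made up, and it assumes the scraper file contains a plain <id>...</id> element as quoted in the note; only the first such element is changed.

```python
import re


def set_scraper_id(xml_text, new_id):
    """Replace the contents of the first <id>...</id> element in the scraper text."""
    # A function is used as the replacement so new_id is inserted literally.
    return re.sub(r"<id>.*?</id>", lambda m: f"<id>{new_id}</id>", xml_text, count=1)
```

Called with the scraper file's text and the ID 874902 from the note, it returns the edited text, which you would then save and import as usual. Keep a backup copy of the original file before overwriting it.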

    So for a proper database you would have to refresh all your movies, so that they can all get this RT score (unless of course you change all the global_options). So on purpose I did not release a version with the same ID as the official imdb.com scraper, because I want the users of this one to look at all the options and be comfortable to edit the scraper settings in notepad.

    However what I will do in the next version is make it easier to switch to the official imdb.com ID by adding it in the header under the comments. That way you can copy and paste from within the scraper source to adjust.

    The pain in the ass was again fixing all the titles that didn't scan/match properly

    It would really help me (and others) if you could provide me with a list of filenames that didn't scan/match. Perhaps it will be possible for me to improve the scraper, so that this will go better next time. I run this scraper myself with the changed ID, because I did not want to lose my watched status and all the adjustments to titles I had already made. As I explained to damaster, the next version will make it a little easier to switch. The only problem then is that you have to make those changes every time a new version comes out (unless you prefer other global_options than the default settings). But I will adjust the explanation in the first post as well when I release the next version to fix the problems reported by the other users. -- let me know if my new explanation is less confusing :D

    I got French & Russian (!) titles too.

    Hi Gix, can you please read my reply to 'mat123' above? The same goes for you. I am sure imdb.com has more English country codes that I need to add, but I will need your help to find them. So please paste2.org me the HTML source of a movie that failed, so I can fix it.
     
    DMember 49125

    Guest
    Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

    Hi Gix, can you please read my reply to 'mat123' above? The same goes for you. I am sure imdb.com has more English country codes that I need to add, but I will need your help to find them. So please paste2.org me the HTML source of a movie that failed, so I can fix it.

    Paste2: Next Generation Pastebin - Viewing Paste 1269399

    Thank you for fixing that. It is very annoying.:D
     

    pirivan

    Portal Pro
    January 19, 2008
    62
    2
    Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

    RoChess

    I apologize if my previous post came off the wrong way; I didn't mean it was really a problem with the scraper. Every scraper I have ever used has had some issues matching titles. I have always felt that this was just the way it is, given that how I have named files may or may not line up with how they are named in the database they are being matched against (whether that be IMDB, RT, thetvdb, etc.). It is not necessarily some inherent problem with what you set up. Anyhow, I wish I had written down some of the files I had to 'tweak' names on to get them to match, but I did not. I COULD re-send it all to the scanner to find out, but I would rather not :).

    I do recall that it was a bit frustrating that all of my anime films are titled in English, and when they are matched against the DB the "match" is their Asian-language title. So I look at it and, not knowing any Japanese etc., I either have to A) look up what the title is in Japanese online and determine that it is indeed a match, or B) just guess that it probably matched correctly (which is what I did). I don't know if this is a huge issue, but I thought I would mention it.

    So far the only real oddity I have noticed is that some movies just don't appear to be getting the correct rating for some reason. Two specific examples I have found so far are:

    Big Fan Movie Reviews, Pictures - Rotten Tomatoes
    Private Parts Movie Reviews, Pictures - Rotten Tomatoes

    In the scraper, Big Fan gets returned as having a rating of 6.8 when it should be 8.8, and Private Parts gets returned as 6.0 when it should be 7.9. I have set both movies back to the importer or tried refreshing from IMDB+ but I get the same results.

    There are probably other examples of this in my collection that I have yet to notice, and it is a bit laborious to search for every movie on Rottentomatoes and compare the tomatometer rating with what MP shows :).

    Anyhow, I hope the information is helpful; thanks again for the great scraper work!
     

    zicoz

    MP Donator
  • Premium Supporter
  • September 3, 2006
    896
    63
    Home Country
    Norway Norway
    Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

    Did they move around their servers or something? For some reason True Grit is imported as Cent dollars pour un shérif.
     

    Matt Kirby

    Portal Member
    June 14, 2009
    43
    8
    Home Country
    United Kingdom United Kingdom
    Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

    I'm in the UK and I've also had issues getting the wrong title for films, and I've come up with a work-around / fix.

    I followed Rochess's notes on looking at the HTML source from IMDB, and I was seeing "GB" for the country-code. As RoChess said, this scraper checks to find one of US|UK|CA|AU|NZ, if it does find one of these codes it uses the main title, if not it uses the "Also known as" title.

    If anyone (in the UK) wants to manually fix this themselves, it's a fairly simple fix:
    Open the .xml file for this scraper
    Search for "US|UK|CA|AU|NZ", and replace with "US|UK|GB|CA|AU|NZ" (not sure if UK is even needed or used, my guess is that it doesn't hurt to leave it in!)
    Save
    You will then need to remove this scraper from MovingPictures and re-add it I think (not too sure on this one!)
    Then, refresh movie details from internet for each of your incorrect films
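The search-and-replace step in the list above could also be done with a few lines of Python. This is only a sketch: the function name is made up, and it assumes the country list appears in the scraper file exactly as the string quoted in the post.

```python
OLD_CODES = "US|UK|CA|AU|NZ"
NEW_CODES = "US|UK|GB|CA|AU|NZ"


def add_gb(xml_text):
    """Return the scraper text with GB added, plus how many places were changed."""
    changed = xml_text.count(OLD_CODES)
    return xml_text.replace(OLD_CODES, NEW_CODES), changed
```

A returned count of 0 means the original string was not found, for example because the file was already patched or the list is written differently in that version, in which case the text comes back unchanged.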

    Now for some guess-work. For people who are not in US|UK|CA|AU|NZ who want to use the main IMDB title using this scraper, you might be able to get it to work by doing the following, but a lot of this is guess-work, and I have no idea if this will cause other issues, so do so at your own risk:
    Search for a film on IMDB, and then view the HTML source of the film's page. On line 49 you should (hopefully) find the country code that IMDB is using for your country. It should be a 2 letter code, surrounded by quote marks, and should have "title" in the line above. I am making several assumptions here- the country code might not always be on line 49, and I have no idea how IMDB's region detection works for their website!
    Assuming that you've found your country-code (and you want to use the main IMDB title rather than the "Also known as" title), edit the .xml file (as above) to enter your country code to the filter list.
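As a rough illustration of the lookup described above, the following Python sketch scans saved page source for a quoted 2-letter code on the line after a line containing "title". It carries the same assumptions as the description (the real IMDb markup may look different), and the function name is made up for the example.

```python
import re


def find_country_code(html_text):
    """Return the first quoted 2-letter code that follows a line containing 'title'."""
    lines = html_text.splitlines()
    for previous, current in zip(lines, lines[1:]):
        if "title" in previous.lower():
            match = re.search(r'"([A-Z]{2})"', current)
            if match:
                return match.group(1)
    return None
```

If it returns None, the page layout does not match these guesses and the code has to be found by eye instead.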

    Once again, this is based on guess-work and several assumptions, so YMMV.

    RoChess: thanks for all your work on this, this scraper is great! I've always wanted my films collection to show the UK cinema rating, which this scraper does for me excellently. Could you add "GB" to the official version of this scraper, as "GB" is the correct country code for the UK and seems to be what IMDB is using. I don't think that UK would even be needed in your scraper. Many thanks.
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #19
    Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

    I apologize if my previous post came off the wrong way..... Anyhow, I do recall that it was a bit frustrating that all of my anime films I have titled in English and when they are matched against the DB the "match" is their Asian language title.

    In the scraper big fan gets returned as having a rating of 6.8 when it should be 8.8 and Private parts gets returned as 6.0 when it should be 7.9.

    No problem, it didn't come off wrong, I just thought I needed to elaborate better. As for the Asian titles matching with their original name, I might be able to correct that. You see, the English title only gets corrected inside the DETAILS node, but the matching occurs in the SEARCH node, and the latter relies on akas.imdb.com, which shows more original titles (the Asian title in this case). It is still the correct movie based on the English title in the AKA results, but it is indeed tough to find out if it is the right one. The problem is complicated by the fact that some users will have the Asian title in their filename, and then it would fail if the English one is selected.

    But I have an extensive Asian collection myself, so at least I can test this myself. Unfortunately only to some degree, with the geographic-location-based title translation being a major pain in the :D

    Thank you for fixing that. It is very annoying.:D

    Did they move around their servers or something? For some reason True Grit is imported as Cent dollars pour un shérif.

    I'm in the UK and I've also had issues getting the wrong title for films...... Could you add "GB" to the official version of this scraper, as "GB" is the correct country code for the UK and seems to be what IMDB is using. I don't think that UK would even be needed in your scraper. Many thanks.

    Ok, it looks like my assumptions on how IMDb handles the title in other countries were wrong. My only reference has been US (America), NL (The Netherlands) and AU (Australia). I figured I would add in New Zealand (NZ), United Kingdom (UK) and Canada (CA) for good measure, but thanks to Matt it is already clear that IMDB uses GB for Great Britain instead.

    The problem now is that a country like Greece with code GR is not using a translated title, which is totally throwing off the way my script works. So to avoid problems I need to switch from a blacklist to a whitelist method, meaning that I need to find out all the 2-letter country codes where IMDB uses localized titles. This is most likely a long list, and I already know it will include DE, ES, IT, etc., but it will take time to get help from other MovingPictures users to build this list.

    So for the time being I will keep using the blacklist method, replace UK with GB, and add the GR (Greece) and NO (Norway) codes as well.
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #20
    Re: IMDb+ Scraper (short/long summary, imdb/RT score, US/UK rating, and more)

    Paste2: Next Generation Pastebin - Viewing Paste 1269399

    Thank you for fixing that. It is very annoying.:D

    Ok this is frustrating.

    IMDb does translate the title into Greek (lol, I hope it is Greek).

    What my script sees on your HTML code is:

    Main Title = Oi epomenes treis meres
    Original Title = The Next Three Days
    Also known as = Les trois prochains jours

    My script doesn't know what language is actually used, so it has to make big assumptions. In this case when the country code is *NOT* an English speaking country it will assume that the main title is translated in local language, so it takes the original title.
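The assumption described above can be sketched in Python as follows. This is an illustration, not the scraper's real code: the function name is made up, the default code list is the one named earlier in the thread, and falling back to the main title when no original title is present is an added assumption.

```python
def choose_title(country_code, main_title, original_title,
                 english_codes=("US", "UK", "CA", "AU", "NZ")):
    """Keep the main title for English-speaking codes, else prefer the original title."""
    if country_code in english_codes:
        return main_title
    # Assumption: if the page has no original title, keep the main title.
    return original_title or main_title
```

For the three titles listed above, and taking GR as the country code, choose_title("GR", "Oi epomenes treis meres", "The Next Three Days") returns "The Next Three Days".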

    Now when I test my script with Expresso on your HTML code, the result I get is "The Next Three Days", which is the correct English title. And you are saying you get the AKA French one?
     
