IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.7

Should this be the default imdb scraper?

  • Yes, I do not want to re-import

    Votes: 19 95.0%
  • No, keep this one separate

    Votes: 0 0.0%
  • Who cares, I got movies to watch

    Votes: 1 5.0%

  • Total voters
    20
  • Poll closed.

RoChess

Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #91
    Re: IMDb+ Scraper (Force English titles, Auto-Rename titles to group, and more) v3.1.

    What does the following line do? Looks like english speaking nations, but can't find any info on what the line actually does.

    <set name="global_options_country_filter" value="us|gb|ie|au|nz" />

    Does it do the same as the

    <set name="global_options_language_filter" value="no|sv|qae|da|en|us" />

    But use the country tag instead of language? Do I have to use both, or can I delete one?

    As explained in the scraper comment above it via:

    Code:
    The country filter was added to avoid mistakes on foreign movies in English

    Filtering on language alone was not enough to find the right title, because an English-language movie is sometimes released first in, for example, Japan (ja) under its foreign title. Thanks for posting that though: it made me compare my personal scraper against the one I uploaded to the first post, and I noticed that Canada (ca) is missing. You can manually add |ca| to the country_filter setting for now, or wait until I release v3.1.5 later this month (still trying to optimize).
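A rough sketch of how the two filters appear to interact, going only by the results reported in this thread: the original title is kept when both the country and the language are in their filter lists, otherwise the English AKA is used. The `Movie` record, `parse_filter` and `pick_title` are hypothetical names for illustration, not the scraper's actual code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    """Hypothetical record; the real scraper works on IMDb pages, not on this type."""
    original_title: str  # title used in the country of first release
    english_title: str   # English AKA title
    country: str         # country code of first release, e.g. "dk"
    language: str        # language code, e.g. "da"


def parse_filter(value: str) -> set[str]:
    """Split a pipe-separated filter value such as 'us|gb|ie' into a set of codes."""
    return {code.strip().lower() for code in value.split("|") if code.strip()}


def pick_title(movie: Movie, country_filter: str, language_filter: str) -> str:
    """Keep the original title only when BOTH filters match (assumed rule)."""
    countries = parse_filter(country_filter)
    languages = parse_filter(language_filter)
    if movie.country in countries and movie.language in languages:
        return movie.original_title
    return movie.english_title


if __name__ == "__main__":
    lygter = Movie("Blinkende Lygter", "Flickering Lights", "dk", "da")
    # Default filters: Denmark is not listed, so the English AKA is chosen.
    print(pick_title(lygter, "us|gb|ie|au|nz", "en"))
    # Filters extended with |dk| and |da|: the Danish title is kept.
    print(pick_title(lygter, "us|gb|ie|au|nz|ca|no|se|dk", "en|no|sv|da"))
```

Under this reading, an English-language movie first released outside the listed countries fails the country test even though its language matches, which is why the language filter alone was not enough.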

    As for the rename dBase, nice catch on the remake; I'll also add some of your other entries. I see that you had to add "City of the Living Dead" to it as well, because it is an English language movie released in Italy with the correct title. Adding |it| to the country_filter would solve that, however it might break other movies. My sample set of Italian movies is too small to verify this, so hopefully you can help me with this. However, the AKA title for USA/English is correct, so I wonder why you had to add an entry to force the rename. I will have to debug that movie.

    For the Bachelor Party series, the original part 1 is the 1984 version with Tom Hanks; you can verify this on the movie connections page at imdb.com, so I took the liberty of correcting this for v0.6 of the Rename dBase (check first post). This version also contains the adjustments that make every movie series use Roman numerals and retain the correct order for 9+ via the SortBy field.
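To illustrate why a separate sort key is needed once a series reaches part 9: Roman numerals do not sort alphabetically ("IX" lands before "V"), while a zero-padded number does. The sketch below builds a rename entry with both; the attribute names mirror the rename entries posted in this thread, and the helper names `to_roman` and `rename_entry` are made up for the example.

```python
from xml.sax.saxutils import quoteattr

# Value/symbol pairs in descending order, including the subtractive forms.
_ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert a positive integer to a Roman numeral."""
    if number < 1:
        raise ValueError("Roman numerals start at 1")
    parts = []
    for value, symbol in _ROMAN:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def rename_entry(imdb_id: str, series: str, part: int, subtitle: str) -> str:
    """Build one rename line: Roman numeral in the title, padded number in sortby."""
    title = f"{series} {to_roman(part)}: {subtitle}"
    sortby = f"{series} {part:02d}"
    # quoteattr adds the surrounding quotes and escapes '&', '<' and '>'.
    return (
        f"<rename id={quoteattr(imdb_id)} title={quoteattr(title)} "
        f"sortby={quoteattr(sortby)} />"
    )


if __name__ == "__main__":
    # Alphabetical order puts IX ahead of V, which is the problem sortby avoids.
    print(sorted(to_roman(n) for n in range(1, 11)))
    print(rename_entry("tt0000000", "Example Series", 9, "Part Nine"))
```

The id `tt0000000` and the series name are placeholders, not real entries from the Rename dBase.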

    I'll hold off on v3.1.5 of the scraper with the |ca| change, in case you can help me find out if adding |it| makes sense as well. Maybe I should work the other way around and use |ja| as a blacklist, but then I'll probably end up with a large list as well for China, Korea, Taiwan, and other countries that might release an English language movie with the wrong title. So help me test please :)
     

    zicoz

    MP Donator
  • Premium Supporter
  • September 3, 2006
    896
    63
    Home Country
    Norway Norway
    Re: IMDb+ Scraper (Force English titles, Auto-Rename titles to group, and more) v3.1.

    What do you people think is the best solution for Romero's Zombie series?

    The original US movies:

    tt0063350 - Night of the Living Dead (1968)
    tt0077402 - Dawn of the Dead (1978)
    tt0088993 - Day of the Dead (1985)

    Then there are:

    tt0418819 - Land of the Dead (2005)
    tt0848557 - Diary of the Dead (2007)
    tt1134854 - Survival of the Dead (2009)

    Then there are remakes:

    tt0100258 - Night of the Living Dead (1990)
    tt0363547 - Dawn of the Dead (2004)
    tt0489018 - Day of the Dead (2008)

    And then there is a 2nd remake, which isn't really a remake but more of an homage.

    tt0489244 - Night of the Living Dead 3D

    And if I'm not mistaken there are some Italian movies as well (Zombi 2 and Zombi 3).

    As explained in the scraper comment above it via:

    Code:
    The country filter was added to avoid mistakes on foreign movies in English

    Filtering on language alone was not enough to find the right title, because an English-language movie is sometimes released first in, for example, Japan (ja) under its foreign title. Thanks for posting that though: it made me compare my personal scraper against the one I uploaded to the first post, and I noticed that Canada (ca) is missing. You can manually add |ca| to the country_filter setting for now, or wait until I release v3.1.5 later this month (still trying to optimize).

    Thank you. I came to that conclusion in the early morning hours as well. I figured I had to add |ca| as well when I was trying to import Meatballs (1979), which turns out to be a Canadian movie. Who knew? :)

    One thing I've noticed, though, is that I now have to add |dk| under country and |da| under language to get "Blinkende Lygter" as "Blinkende Lygter" and not "Flickering Lights".

    http://www.imdb.com/title/tt0236027/

    Something else I've noticed is that IMDb sometimes uses different codes for language and country.

    Take Denmark: they use |da| for language and |dk| for country. It's not a problem, but something people should be aware of.
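Since the country code and the language code can differ for the same place, it may help to keep them side by side and generate both filter values from one list. A small sketch; the code pairs are the ones used in the filter settings posted in this thread, and `REGION_CODES` and `build_filters` are hypothetical names.

```python
# (country code, language code) per region; values follow the filter
# settings posted in this thread, not an official IMDb list.
REGION_CODES = {
    "United States": ("us", "en"),
    "United Kingdom": ("gb", "en"),
    "Ireland": ("ie", "en"),
    "Australia": ("au", "en"),
    "New Zealand": ("nz", "en"),
    "Canada": ("ca", "en"),
    "Norway": ("no", "no"),
    "Sweden": ("se", "sv"),
    "Denmark": ("dk", "da"),
}


def build_filters(regions):
    """Return (country_filter, language_filter) as pipe-separated strings."""
    countries, languages = [], []
    for region in regions:
        country, language = REGION_CODES[region]
        # Keep first-seen order and skip duplicates (e.g. 'en' for many countries).
        if country not in countries:
            countries.append(country)
        if language not in languages:
            languages.append(language)
    return "|".join(countries), "|".join(languages)


if __name__ == "__main__":
    country_filter, language_filter = build_filters(REGION_CODES)
    print(f'<set name="global_options_country_filter" value="{country_filter}" />')
    print(f'<set name="global_options_language_filter" value="{language_filter}" />')
```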

    As for the rename dBase, nice catch on the remake; I'll also add some of your other entries. I see that you had to add "City of the Living Dead" to it as well, because it is an English language movie released in Italy with the correct title. Adding |it| to the country_filter would solve that, however it might break other movies. My sample set of Italian movies is too small to verify this, so hopefully you can help me with this. However, the AKA title for USA/English is correct, so I wonder why you had to add an entry to force the rename. I will have to debug that movie.

    It worked out in the end. I don't know what was wrong; a couple of movies got weird names, for example "Hero" came in as "Quentin Tarantino presents Hero" or something like that, but after a new import it worked fine.

    For the Bachelor Party series, the original part 1 is the 1984 version with Tom Hanks; you can verify this on the movie connections page at imdb.com, so I took the liberty of correcting this for v0.6 of the Rename dBase (check first post). This version also contains the adjustments that make every movie series use Roman numerals and retain the correct order for 9+ via the SortBy field.

    Thanks for pointing this out, I knew there was some sort of system that linked movies on IMDB, but couldn't find it so I went with the one that came out a couple of years earlier.

    I'll hold off on v3.1.5 of the scraper with the |ca| change, in case you can help me find out if adding |it| makes sense as well. Maybe I should work the other way around and use |ja| as a blacklist, but then I'll probably end up with a large list as well for China, Korea, Taiwan, and other countries that might release an English language movie with the wrong title. So help me test please :)

    Well, for me adding |it| works, but that's because I prefer US over Italian titles. I have 2 other movies that are "Italian":

    tt0086135 - I predatori di Atlantide (1983) - Which is the same case as "City of the Living Dead": an English movie released in Italy
    tt0284717 - Crusadi (2001) - Which I believe is an actual Italian TV movie; at least it's made by RAI, which is an Italian TV station, but the audio on my version is English, and it's listed as English language on IMDb as well.

    But like I said, for me personally I have no problem adding |it|, but Italians might not like what that does.
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #93
    Re: IMDb+ Scraper (Force English titles, Auto-Rename titles to group, and more) v3.1.

    tt0284717 - Crusadi (2001) - Which I believe is an actual Italian TV movie; at least it's made by RAI, which is an Italian TV station, but the audio on my version is English, and it's listed as English language on IMDb as well.

    Thank you for that example, it is an English language movie, first released in Italy (usually means it was produced there), but the title is Italian. So adding |it| to the country_filter would then cause a problem for these titles, unless of course you prefer the Italian title for this movie.

    It's going to be a problem either way, so I'll leave |it| out of the default version, but you can add it yourself. For the same reason, you are encouraged to edit the filters if you prefer to retain Norwegian titles for Norwegian productions (this was the original reason for these filters), as you did for yourself.

    I already feel that I'm bordering on semi-AI to find the English title via all these if-then-else checks, but I ended up adding another one that searches for "USA (imdb display title)" first. This solves your "City of the Living Dead" problem without having to add Italy to the country filter list.
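As an illustration only, the extra check could look something like the sketch below: first look for a row labelled "USA (imdb display title)", and only then fall back to any other USA row, scanned from the bottom of the AKA list upwards. The row format, the order of the fallbacks, and the name `pick_english_title` are assumptions for the example; the real scraper works on the IMDb AKA page and may differ.

```python
def pick_english_title(akas, fallback):
    """Pick an English title from (label, title) rows of an AKA list.

    Assumed priority:
      1. a row whose label mentions "USA (imdb display title)"
      2. any other row mentioning USA, searched bottom-up
      3. the fallback title
    """
    # Pass 1: the explicit display-title marker wins outright.
    for label, title in akas:
        if "usa (imdb display title)" in label.lower():
            return title
    # Pass 2: any USA row, starting from the bottom of the list.
    for label, title in reversed(akas):
        if "usa" in label.lower():
            return title
    return fallback


if __name__ == "__main__":
    # Hypothetical rows modelled on the "City of the Living Dead" case.
    rows = [
        ("USA (pre-release title)", "Twilight of the Dead"),
        ("International (English title) / USA (imdb display title)",
         "City of the Living Dead"),
    ]
    print(pick_english_title(rows, fallback="Paura nella citta del morti viventi"))
```

With only the pre-release row present, this sketch would still return "Twilight of the Dead", so the display-title check helps only when IMDb actually lists that marker.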

    I had a nice collection of movie filenames that let me verify these problematic scenarios, including one movie I used to test the change that looks for the USA title bottom-up on the AKA title page. Unfortunately, when I upgraded to MediaPortal v1.2 I lost that collection of files, so now I can't verify that the new change in v3.1.5 does not break those other special cases. Guess we will find out :D
     

    zicoz

    MP Donator
  • Premium Supporter
  • September 3, 2006
    896
    63
    Home Country
    Norway Norway
    Re: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.5

    I've created a folder of test movies that contains the following movies:

    ARN - Riket vid vägens slut (Swedish)
    ARN - Tempelriddaren (Swedish)
    Blinkende Lygter (Danish)
    The Raiders of Atlantis (English movie from Italy)
    Lange Flate Ballær (Norwegian)
    Lange Flate Ballær II (Norwegian)
    City of the Living Dead (English movie from Italy)
    Crusaders (English movie from Italy)

    And here are the results:

    <set name="global_options_country_filter" value="us|gb|ie|au|nz|ca|no|se|dk" />
    <set name="global_options_language_filter" value="en|no|sv|da" />

    ARN - Riket vid vägens slut
    ARN - Tempelriddaren
    Blinkende Lygter
    The Raiders of Atlantis
    Lange Flate Ballær
    Lange Flate Ballær II
    Twilight of the dead
    Crusaders




    <set name="global_options_country_filter" value="it|us|gb|ie|au|nz|ca|no|se|dk" />
    <set name="global_options_language_filter" value="en|no|sv|da" />


    ARN - Riket vid vägens slut
    ARN - Tempelriddaren
    Blinkende Lygter
    I predatori de Atlantide *
    Lange Flate Ballær
    Lange Flate Ballær II
    Paura nella citta del morti viventi *
    Crociati *

    The first one looks perfect except for the fact that "City of the Living Dead" is named "Twilight of the dead", which is the US pre-release title.

    I'd prefer it to be "City of the Living Dead", which on IMDb is listed as International (English title) / USA (imdb display title).

    Is there a way to fix that other than in the rename file?

    Is the fix you mentioned in your last post meant to solve just that?

    http://www.imdb.com/title/tt0081318/releaseinfo#akas
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #95
    Re: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.5

    I'd prefer it to be "City of the Living Dead", which on IMDb is listed as International (English title) / USA (imdb display title).

    Is there a way to fix that other than in the rename file?

    Grab v3.1.5 from first post, that should fix it :)
     

    zicoz

    MP Donator
  • Premium Supporter
  • September 3, 2006
    896
    63
    Home Country
    Norway Norway
    Re: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.5

    Thank you, my test movies are now just the way I like em :)

    Now I'll spend some time trying to find out how to add the "Living Dead" series in the renamer :)

    Here is my first suggestion:

    Code:
    <rename id="tt0063350" title="Living Dead I: Night of the Living Dead" />
    <rename id="tt0077402" title="Living Dead II: Dawn of the Dead" />
    <rename id="tt0088993" title="Living Dead III: Day of the Dead" />
    <rename id="tt0418819" title="Living Dead IV: Land of the Dead" />
    <rename id="tt0848557" title="Living Dead V: Diary of the Dead" />
    <rename id="tt1134854" title="Living Dead VI: Survival of the Dead" />
    
    <rename id="tt0100258" title="Living Dead (remake) I: Night of the Living Dead" />
    <rename id="tt0363547" title="Living Dead (remake) II: Dawn of the Dead" />
    <rename id="tt0489018" title="Living Dead (remake) III: Day of the Dead" />
    
    <rename id="tt0489244" title="Living Dead (homage) I: Night of the Living Dead 3D" />


    For the Che movies:

    Code:
    	<rename id="tt0892255" title="Che Part I: The Argentine" />
    	<rename id="tt0374569" title="Che Part II: Guerrilla" />

    For American Pie Movies:

    Code:
    	<rename id="tt0163651" title="American Pie I: American Pie" />
    	<rename id="tt0252866" title="American Pie II: American Pie II" />
    	<rename id="tt0328828" title="American Pie III: American Wedding" />
    	<rename id="tt0436058" title="American Pie IV: Band Camp" />
    	<rename id="tt0808146" title="American Pie V: The Naked Mile" />
    	<rename id="tt0974959" title="American Pie VI: Beta House" />
    	<rename id="tt1407050" title="American Pie VII: The Book of Love" />

    Apparently Bachelor Party II is an American Pie movie as well, but the 7 movies above are all "directly" related to the original trilogy.

    Bill & Ted:

    Code:
    	<rename id="tt0096928" title="Bill &amp; Ted I: Bill &amp; Ted's Excellent Adventure" />
    	<rename id="tt0101452" title="Bill &amp; Ted II: Bill &amp; Ted's Bogus Journey" />


    edit:

    Had a problem with the '&' in the Bill & Ted lines, but fixed that thanks to SilentException.
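For anyone hitting the same '&' problem: a bare ampersand is not allowed inside an XML attribute value and has to be written as `&amp;`. A quick way to check an entry before adding it to the rename file is to run it through an XML parser. A minimal sketch using only the Python standard library; `is_well_formed` is a made-up helper name.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# A raw '&' in the title makes the line invalid XML.
broken = '<rename id="tt0096928" title="Bill & Ted I: Bill & Ted\'s Excellent Adventure" />'

# quoteattr escapes '&' (and '<', '>') and adds the surrounding quotes.
title = "Bill & Ted I: Bill & Ted's Excellent Adventure"
fixed = f'<rename id="tt0096928" title={quoteattr(title)} />'

if __name__ == "__main__":
    print(is_well_formed(broken))
    print(fixed)
    print(is_well_formed(fixed))
```

When the fixed line is parsed again, the `title` attribute reads back with a plain '&', so the displayed name is unchanged.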

    Urban Legend movies:

    Code:
    	<rename id="tt0146336" title="Urban Legend I" />
    	<rename id="tt0192731" title="Urban Legends II: Final Cut" />
    	<rename id="tt0451957" title="Urban Legends III: Bloody Mary" />

    Monty Python movies

    Code:
    	<rename id="tt0071853" title="Monty Python 1: Monty Python and the Holy Grail" />
    	<rename id="tt0079470" title="Monty Python 2: Life of Brian" />
    	<rename id="tt0085959" title="Monty Python 3: The Meaning of Life" />

    The Crow movies

    Code:
    	<rename id="tt0109506" title="The Crow I: The Crow" />
    	<rename id="tt0115986" title="The Crow II: City of Angels" />
    	<rename id="tt0132910" title="The Crow III: Salvation" />
    	<rename id="tt0353324" title="The Crow IV: Wicked Prayer" />
    	<rename id="tt0946992" title="The Crow V: Purgatory" />
    	<rename id="tt1347027" title="The Crow VI: Purgatory 2" />


    Olsenbanden (Norwegian movies and titles, so probably not going in the official file)
    Code:
    	<rename id="tt0129273" title="Olsenbanden I: Operasjon Egon" sortby="Olsenbanden 01"/>
    	<rename id="tt0132376" title="Olsenbanden II: Olsenbanden og Dynamitt-Harry" sortby="Olsenbanden 02" />
    	<rename id="tt0132378" title="Olsenbanden III: Olsenbanden tar gull" sortby="Olsenbanden 03"/>
    	<rename id="tt0132377" title="Olsenbanden IV: Olsenbanden og Dynamitt-Harry går amok" sortby="Olsenbanden 04"/>
    	<rename id="tt0132384" title="Olsenbanden V: Olsenbanden Møter Kongen og Knekten" sortby="Olsenbanden 05"/>
    	<rename id="tt0132379" title="Olsenbanden VI: Olsenbandens siste bedrifter" sortby="Olsenbanden 06"/>
    	<rename id="tt0132382" title="Olsenbanden VII: Olsenbanden For Full Musikk" sortby="Olsenbanden 07"/>
    	<rename id="tt0132380" title="Olsenbanden VIII: Olsenbanden og Dynamitt-Harry på Sporet" sortby="Olsenbanden 08"/>
    	<rename id="tt0132381" title="Olsenbanden IX: Olsenbanden og Data-Harry Sprenger Verdensbanken" sortby="Olsenbanden 09"/>
    	<rename id="tt0079661" title="Olsenbanden X: Olsenbanden og Dynamitt-Harry Mot Nye Høyder" sortby="Olsenbanden 10"/>
    	<rename id="tt0132383" title="Olsenbanden XI: Olsenbanden Gir Seg Aldri" sortby="Olsenbanden 11"/>
    	<rename id="tt0132385" title="Olsenbanden XII: Olsenbandens Aller Siste Kupp" sortby="Olsenbanden 12"/>
    	<rename id="tt0087710" title="Olsenbanden XIII: Men Olsenbanden Var Ikke Død" sortby="Olsenbanden 13"/>
    	<rename id="tt0195960" title="Olsenbanden XIV: Olsenbandens Siste Stikk" sortby="Olsenbanden 14"/>

    Varg Veum movies (Norwegian movies and names, so probably not going in the official list)

    Code:
    	<rename id="tt0844948" title="Varg Veum I: Bitre Blomster" sortby="Varg Veum 01" />
    	<rename id="tt1181936" title="Varg Veum II: Tornerose" sortby="Varg Veum 02" />
    	<rename id="tt1181935" title="Varg Veum III: Din til Døden" sortby="Varg Veum 03" />
    	<rename id="tt1010265" title="Varg Veum IV: Falne Engler" sortby="Varg Veum 04"/>
    	<rename id="tt1135937" title="Varg Veum V: Kvinnen i Kjøleskapet" sortby="Varg Veum 05" />
    	<rename id="tt1296458" title="Varg Veum VI: Begravede Hunder" sortby="Varg Veum 06"/>
    	<rename id="tt1572783" title="Varg Veum VII: Skriften på Veggen" sortby="Varg Veum 07"/>
    	<rename id="tt1699164" title="Varg Veum VIII: Svarte Får" sortby="Varg Veum 08"/>
    	<rename id="tt1699163" title="Varg Veum IX: Dødens Drabanter" sortby="Varg Veum 09" />
     

    zicoz

    MP Donator
  • Premium Supporter
  • September 3, 2006
    896
    63
    Home Country
    Norway Norway
    Re: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.5

    Thanks, that did it.
     

    digitalfm

    Portal Pro
    February 4, 2008
    114
    18
    Re: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.5

    Brilliant work RoChess,

    I was simply looking for an updated scraper that pulls in UK certs and found this. I ran it on 170 movies and it found every one of them.

    Superb work!!

    I agree, it would be so good if a hybrid version of this and the current IMDb scraper were packaged by default with Moving Pictures, with some simple options, such as the ones in your global options, being part of the config menu.

    For now though importing the script is not a problem, given what you get out of it. :D
     

    zicoz

    MP Donator
  • Premium Supporter
  • September 3, 2006
    896
    63
    Home Country
    Norway Norway
    Re: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.5

    Planet of the Apes:

    Code:
    	<rename id="tt0063442" title="Planet of the Apes I: Planet of the Apes" />
    	<rename id="tt0065462" title="Planet of the Apes II: Beneath the Planet of the Apes" />
    	<rename id="tt0067065" title="Planet of the Apes III: Escape from the Planet of the Apes" />
    	<rename id="tt0068408" title="Planet of the Apes IV: Conquest of the Planet of the Apes" />
    	<rename id="tt0069768" title="Planet of the Apes V: Battle for the Planet of the Apes" />
    	<rename id="tt0133152" title="Planet of the Apes (remake) I: Planet of the Apes" />
    	<rename id="tt1318514" title="Planet of the Apes VI: Rise of the Planet of the Apes" />

    I'm not sure if the new film this summer is really a number 6, it's the origin story, but I don't know if it's a remake of an earlier movie or not.

    Death Race

    Code:
    	<rename id="tt0072856" title="Death Race I: Death Race 2000"/>
    	<rename id="tt0452608" title="Death Race II: Death Race"/>
    	<rename id="tt1500491" title="Death Race III: Death Race 2"/>

    Not sure if we should include the Death Race 2000 movie in the series, but since all the Batman movies are in the same series, I came to the conclusion that I should include it.

    NeverEnding Story:

    Code:
    	<rename id="tt0088323" title="The NeverEnding Story I: The NeverEnding Story"/>
    	<rename id="tt0100240" title="The NeverEnding Story II: The Next Chapter"/>
    	<rename id="tt0110647" title="The NeverEnding Story III: Escape from Fantasia"/>

    Infernal Affairs:

    Code:
    	<rename id="tt0338564" title="Infernal Affairs I"/>
    	<rename id="tt0369060" title="Infernal Affairs II"/>
    	<rename id="tt0374339" title="Infernal Affairs III: End Inferno"/>


    Mesrine:
    Code:
    	<rename id="tt1259014" title="Mesrine I: Killer Instinct"/>
    	<rename id="tt0411272" title="Mesrine II: Public Enemy no. 1"/>

    Lost Boys:

    Code:
    	<rename id="tt0093437" title="Lost Boys I: The Lost Boys"/>
    	<rename id="tt1031254" title="Lost Boys II: The Tribe"/>
    	<rename id="tt1400526" title="Lost Boys III: The Thirst"/>



    Ong Bak
    Code:
    	<rename id="tt0368909" title="Ong Bak I: Ong Bak"/>
    	<rename id="tt0785035" title="Ong Bak II: The Beginning"/>
    	<rename id="tt1653690" title="Ong Bak III: The Final Battle"/>
     
