Combining Foreign + English title doesn't work

Hes

Portal Member
December 7, 2009
9
0
Home Country
Germany
I'm located in Germany, but my Windows and MediaPortal are set to English. I'd like all movies in MovingPictures to be listed with their original-language titles, but for a while now all new additions have been imported with German titles, which is very annoying. I assume that IMDb delivers a localized German version of its HTML pages, so I thought I'd post a few examples to get to the bottom of this.

The first file contains my scraper settings.

Then I tried to re-import "Abraham Lincoln Vampire Hunter", which is shown in MP as "Abraham Lincoln Vampirjäger". I've included the MovingPictures log file and the HTML source of the imdb page, the akas.imdb page (which looks exactly like the previous page), and the releaseinfo page as seen when using Firefox.

To provide a non-English-language movie, I also tried "De rouille et d'os" (English title "Rust and Bone"), which is shown as "Der Geschmack von Rost und Knochen". Again, I have provided the log file plus the three IMDb page sources.

I hope this helps to get this working again. Thanks very much for your efforts.

Cheers, Michael.
 

Attachments

  • Options IMDb+ Scraper.xml
    1.7 KB

vpupkin

Portal Pro
March 26, 2011
84
8
Actually, I was too quick with the confirmation: while titles do indeed work, I noticed that Actors/Directors/etc. are not populated (all are empty). This used to work before the recent updates to IMDb+...

It is more related to the updates Amazon keeps doing to the IMDb websites. But I'm unable to duplicate, only testing on Actors right now, but it works fine for me. Can you give me HTML source code on a movie you know does not work? Preferably the www.imdb, and akas.imdb page.


This is weird, perhaps indeed Amazon-related. Now when I update the same movies, I get all the actors/directors etc. that were missing. However, English/foreign titles stopped working again (the same ones that were fixed with the latest update). There were no changes on my end; I'm on 4.9.19...
 

Attachments

  • Rust and Bone (2012) - IMDb.zip
    48.2 KB

RoChess

Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    I'm located in Germany, but my Windows and MediaPortal are set to English. I'd like all movies in MovingPictures to be listed with their original-language titles, but for a while now all new additions have been imported with German titles, which is very annoying. I assume that IMDb delivers a localized German version of its HTML pages, so I thought I'd post a few examples to get to the bottom of this.

    Disable the IMDb+ option to 'use original title'

    This *ONLY* works for USA users of IMDb+, because Amazon/IMDb translates the title to your localized language, so what IMDb+ sees as foreign/original movie-title all of a sudden is lost. There is a way for me to possibly obtain that, but I need to totally rewrite the entire IMDb+ detection system for that, and it took me months to create the existing one. Simply don't have the time for it right now, but once I get back from work conference next month it is on my ToDo list.

    If you disable the 'use original title' option, IMDb+ knows that the Amazon/IMDb title is not the "English" one, and it will use the USA title from the AKA page. In your case the 'original title' detection thinks that "Abraham Lincoln Vampirjäger" is the original English title and skips the AKA page.
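A minimal sketch of the selection behaviour described above, in Python rather than the scraper's own format. The names `AkaEntry` and `pick_title` and the `"USA"` region label are assumptions made for illustration; none of this is the actual IMDb+ code.

```python
from dataclasses import dataclass


@dataclass
class AkaEntry:
    """One row of an AKA listing (hypothetical shape): a title and its region."""
    title: str
    region: str


def pick_title(page_title: str, akas: list[AkaEntry], use_original_title: bool) -> str:
    """Choose a display title following the behaviour described in the post."""
    if use_original_title:
        # The page title is trusted as the original/English one, so the AKA
        # list is never consulted. For a visitor from Germany the page title
        # has already been localized, which is where the wrong title comes from.
        return page_title
    # Option disabled: the page title is treated as non-English, and the
    # USA entry from the AKA list wins when one exists.
    for aka in akas:
        if aka.region == "USA":
            return aka.title
    return page_title


akas = [AkaEntry("Abraham Lincoln Vampire Hunter", "USA")]
with_option = pick_title("Abraham Lincoln Vampirjäger", akas, use_original_title=True)
without_option = pick_title("Abraham Lincoln Vampirjäger", akas, use_original_title=False)
```

With the option enabled the localized page title is returned unchanged; with it disabled the USA entry is used instead.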

    Looking at the existing logic, I might actually be able to come up with a trick to make it all work again, so you can use the original title setting. In order for me to do that properly I need the following:

    The akas.imdb HTML source of as many movie combinations as possible. You will need to do this inside Internet Explorer, because that is the engine MovPic uses. I'll need movies that show the correct English title for you as-is (because there is no German translated title for them), a foreign-title movie (such as Zwartboek) that has a German translated title, a foreign-title movie that doesn't have a German translated title, and finally a German movie that has an English translated title.

    Get me that and I might actually be able to come up with a quick-n-dirty solution in short term, versus a few months full rewrite delay.
     

    Hes

    Disable the IMDb+ option to 'use original title'

    I have done so now and re-imported; it now shows the English title first and then (in parentheses) the German title.

    This *ONLY* works for USA users of IMDb+, because Amazon/IMDb translates the title to your localized language, so what IMDb+ sees as foreign/original movie-title all of a sudden is lost. There is a way for me to possibly obtain that, but I need to totally rewrite the entire IMDb+ detection system for that, and it took me months to create the existing one. Simply don't have the time for it right now, but once I get back from work conference next month it is on my ToDo list.

    I can understand that completely; having done a bit of web scraping myself, I know that dealing with pages that get tinkered with a lot is a nightmare.

    Looking at the existing logic, I might actually be able to come up with a trick to make it all work again, so you can use the original title setting. In order for me to do that properly I need the following:

    The akas.imdb HTML source of as many movie combinations as possible. You will need to do this inside Internet Explorer, because that is the engine MovPic uses. I'll need movies that show the correct English title for you as-is (because there is no German translated title for them), a foreign-title movie (such as Zwartboek) that has a German translated title, a foreign-title movie that doesn't have a German translated title, and finally a German movie that has an English translated title.


    Okay, I have collected samples for
    • Décalage Horaire (which has the German title "Jet Lag oder Wo die Liebe hinfliegt")
    • Zwartboek (shown as "Black Book")
    • Serbian movie "Parada" (which has no German title)
    • French movie "Alceste à Bicyclette" (again without a German title)
    • US movie "Man of Steel" (no German title)
    • US movie "Transformers" (no German title)
    • US movie "Dan in Real Life" (shown in Germany as "Dan - mitten im Leben!")
    • German movie "Wir sind die Nacht" (international title "We are the Night")
    • German movie "Groupies bleiben nicht zum Frühstück" (no international title, was shown in the Netherlands as "Single by Contract")
    If you need more examples just tell me what to look for.

    Thanks, Michael.
     

    Attachments

    • imdbsamples.zip
      409.7 KB

    RoChess

    @Hes, ok, fixed all the RegExp codes, so full English title detection is working again, even when "(original title)" is missing from AKA page. Of course Amazon can not be as kind and let me know the language of the title. If they could just add "(original English title)" then things would be sooooo much easier.

    I actually thought I had an idea, but then I came across 2 of the movies you provided HTML sources for. For example, "Alceste à bicyclette" doesn't even have that title on the AKA page, so there is no way for me to detect the language of the title there. I can deduce that from the details page though, as country+language give it away, the same way I currently detect English titles. There are just too many country+language combinations, though. It is already a pain in the .... to do this for just English, let alone if I'm going to support every other language.

    Still if the title on the IMDb details page does not exist on the AKA page as USA/World-wide-English I can make rough assumptions that it is a foreign title that was not translated. It is supposed to put that as "(original title)" then for you after English title, but Amazon never makes it easy on me. I just hope I can account for all the combinations with a little bit of additional code, otherwise it is going to become a maintenance nightmare.
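The rough assumption in the last paragraph could be sketched like this in Python. The function name, the `(title, label)` row shape, and the label strings are hypothetical placeholders for illustration, not the real AKA page markup or the IMDb+ regular expressions.

```python
def is_untranslated_foreign_title(details_title: str, aka_rows: list[tuple[str, str]]) -> bool:
    """Guess whether the details-page title is a foreign title without an English version.

    aka_rows holds (title, label) pairs scraped from an AKA listing.
    The label prefixes below are assumed, not taken from the real page.
    """
    english_labels = ("usa", "world-wide (english title)")
    english_titles = {
        title.casefold()
        for title, label in aka_rows
        if label.casefold().startswith(english_labels)
    }
    # Rough assumption: a details-page title that is not listed as a
    # USA/world-wide English title is treated as an untranslated foreign title.
    return details_title.casefold() not in english_titles


# Placeholder rows, not real IMDb data.
foreign = is_untranslated_foreign_title("Alceste à bicyclette", [("Placeholder Title", "Italy")])
english = is_untranslated_foreign_title("Man of Steel", [("Man of Steel", "USA")])
```

This only covers the English case; as noted above, doing the same for every other country+language combination would need far more rules.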
     

    vpupkin

    Thanks RoChess, the latest update mostly works. I tried it on my last ~15 movies, and English/foreign titles, as well as actors/directors/etc., seem to work. One non-foreign movie title is now imported incorrectly though (it used to work fine): Singin' in the rain (1952). The title imports, but now it is in parentheses:

    Singin' in the rain -> (Singin' in the rain)
     

    Attachments

    • Singin' in the Rain (1952) - IMDb.zip
      47.8 KB

    RoChess

    Thanks RoChess, the latest update mostly works. I tried it on my last ~15 movies, and English/foreign titles, as well as actors/directors/etc., seem to work. One non-foreign movie title is now imported incorrectly though (it used to work fine): Singin' in the rain (1952). The title imports, but now it is in parentheses:

    Singin' in the rain -> (Singin' in the rain)

    Can you give me the scraper-debug enabled MovPic log file when you reimport that one?

    I'm running all title regular expressions on the HTML sources you sent, and none of them would cause parentheses.
     

    RoChess

    Try v4.9.21; I now have to account for empty titles for some movies with this new Amazon/IMDb system.

    Let me know if that works.
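The thread does not show what changed in v4.9.21. Purely as an illustration, here is one way an empty title can end up producing a parentheses-only result, and how a guard avoids it. The Python function `combine_titles` and the "English (original)" output format are assumptions, not the actual IMDb+ logic.

```python
def combine_titles(english: str, original: str) -> str:
    """Join an English and an original title as 'English (Original)'.

    Hypothetical helper: a naive f"{english} ({original})" with an empty
    English title would yield " (Original)", so empty values are handled first.
    """
    english = english.strip()
    original = original.strip()
    if not english:
        # No English title was detected: fall back to the only title available.
        return original
    if not original or original.casefold() == english.casefold():
        # Nothing extra to show, or both titles are the same.
        return english
    return f"{english} ({original})"


fixed = combine_titles("", "Singin' in the Rain")
combined = combine_titles("Black Book", "Zwartboek")
```

Here `fixed` is the plain title without parentheses, and `combined` keeps both titles.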
     

    vpupkin

    I know we can always count on RoChess ;) 4.9.21 seems to work fine again - I didn't do extensive testing, but on the several titles that had different problems in the last couple of weeks all issues seem to be resolved.

    Thanks!
     
