IMDb English Title Scraper

GazpachoKing

Portal Pro
February 8, 2006
75
28
I posted this in another thread before, but I have updated it a little, so I decided to give it its own thread.

The database does not store the extra info for the alternate titles, so that information cannot be used once the scraper has run. I have created an alternate scraper that should grab the English title for movies.

I haven't done a lot of testing, but here is how it should work:
If the 'Language' field for the movie is anything but English, it searches for the first alternate title containing either 'USA' or 'English title' and assigns that as the main title for the movie. When you pick the appropriate movie from the list of possible matches, it still shows up with the original IMDb title; however, once you select it, your movie should be named properly in English.
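The selection rule described above can be sketched in Python. This only illustrates the logic and is not the scraper's actual XML; the function name `pick_english_title`, the `(title, note)` pair format for alternate titles, and the sample data are all assumptions made up for the example.

```python
import re

# Marker text looked for in each alternate title's note, mirroring the
# scraper's regex "((USA)|(English title))".
ENGLISH_MARKER = re.compile(r"(USA)|(English title)")


def pick_english_title(original_title, language, akas):
    """Return the title to store for a movie.

    akas is a list of (title, note) pairs, a hypothetical stand-in for the
    alternate-title entries listed for the movie.
    """
    if language == "English":
        return original_title
    for title, note in akas:
        if ENGLISH_MARKER.search(note):
            return title  # the first matching alternate title wins
    # No English alternate title found: keep the original title.
    return original_title


# Made-up sample data, not taken from IMDb.
akas = [("Der Originaltitel", "Germany"),
        ("The English Title", "USA"),
        ("Another English Title", "International (English title)")]
print(pick_english_title("Der Originaltitel", "German", akas))  # The English Title
```

Falling back to the original title when nothing matches is the behaviour the 1.0.2 and 1.0.3 change log entries are aiming for.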

I was running into a problem where it would set the 'sort by' field to the original foreign title, so I started hard-setting that in the scraper as well (stripping the leading 'the', 'a' and 'an'). I didn't see any other scrapers setting the sort by field manually, so I'm not sure why my version has to.
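As a rough illustration of the 'sort by' handling, here is a minimal Python sketch of stripping a leading article. It is not the scraper's own code; the name `sort_title` is hypothetical, and whether the real scraper also changes case or punctuation is not covered here.

```python
import re

# A leading article must be followed by whitespace, so the "the" in "Them!"
# or the "an" in "Anaconda" is not mistaken for an article.
LEADING_ARTICLE = re.compile(r"^(the|an|a)\s+", re.IGNORECASE)


def sort_title(title):
    """Return the title with one leading 'the', 'a' or 'an' removed."""
    return LEADING_ARTICLE.sub("", title.strip(), count=1)


print(sort_title("The Matrix"))  # Matrix
print(sort_title("Them!"))       # Them!
```

Requiring whitespace after the article is what avoids the kind of bug described in the 1.0.1 change log entry.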

I am attaching 2 versions: one that pulls the full summary and one that pulls the short plot. Both use the new English alternate title. If anyone runs into problems, let me know and I will take a look.

Change log:

1.0.1
Fixed a bug that would let it remove 'the' from the beginning of a title even if it was part of a word, e.g. 'them'. Also, it now strips 'a' and 'an' from the sort by field.

1.0.2
Fixed a bug that would delete the title of a foreign movie if it did not have an English name.

1.0.3
Actually fixed the bug that I thought I had fixed in 1.0.2.
 

Attachments

  • IMDbEnglishTitle.1.0.3.xml
    30.7 KB
  • IMDbSmallPlotEnglishTitle.1.0.3.xml
    30.7 KB

awatrin

MP Donator
  • Premium Supporter
  • December 7, 2008
    8
    0
    Home Country
    Brazil Brazil
    Hi GazpachoKing, is it possible to use this scraper to grab the title and summary in another language (e.g. Portuguese)? Which lines do I have to change?

    Thanks
     

    clahti

    Portal Member
    November 19, 2008
    27
    2
    Hi GazpachoKing, I am wondering whether the following scenario would be ideal: if the original movie title is not in English, grab the USA English movie title and put it into the title field, but then take the original non-English title and place it in the alternate title field. This would be good for people like me who have a wife who loves foreign movies and can actually read the original title, and for people like me who need to see the English title for lack of language skills :) That way the skin could be modified to show both values, but keep the English title as the default.
     

    GazpachoKing

    awatrin said:
    Hi GazpachoKing, is it possible to use this scraper to grab the title and summary in another language (e.g. Portuguese)? Which lines do I have to change?

    Thanks

    Yes, that'll work. There are 2 lines you need to modify.
    Change this line:
    <parse name="cntry" input="${currAka[1]:htmldecode}" regex="((USA)|(English title))" />
    so that it matches the country or language whose title should override the movie's original name, e.g.:
    <parse name="cntry" input="${currAka[1]:htmldecode}" regex="(Portugal)" />
    You also need to change this line:
    <if test="${movie.language}!=English">
    To:
    <if test="${movie.language}!=Portuguese">

    I haven't tried it though, so I'm not sure how well it will work.

    clahti said:
    I am wondering whether the following scenario would be ideal: if the original movie title is not in English, grab the USA English movie title and put it into the title field, but then take the original non-English title and place it in the alternate title field.

    I think if you add this line:
    <set name="movie.alternate_titles" value="${movie.title}"/>
    Right after the line:
    <if test="${movie.language}!=English">
    It should do what you describe.
     

    awatrin


    Thanks GazpachoKing, I followed your suggestion and made other changes too, and now it's fully working to get the Brazilian titles. Thanks for your help!

    Arthur
     

    JSorrentino

    Portal Pro
    September 30, 2008
    90
    4
    Home Country
    United States of America United States of America
    Works perfectly, thank you!

    For others who already have movies in the DB with foreign names and short descriptions, as I did: I had to ignore the movies, then go to the importer and unignore them. This got them to re-scrape and pick up the new information. There may be an easier way, but this worked for me.

    Thanks for the work and time you put in on this!
     

    ronysrei

    Portal Member
    May 15, 2009
    19
    4
    Home Country
    Brazil Brazil
    Hi GazpachoKing. I also made the changes to the scraper to get the titles in Brazilian Portuguese. It's working really nicely! Thank you!

    Is there, by any chance, a way to get the English title if the Brazil (Portuguese) title is not available?

    If not available, Moving Pictures is showing $(movie.title).

    I appreciate it,

    Ron
     

    GazpachoKing

    ronysrei said:
    Is there, by any chance, a way to get the English title if the Brazil (Portuguese) title is not available?

    If not available, Moving Pictures is showing $(movie.title).

    I appreciate it,

    Ron

    Hmm, that is what it should already do. $(movie.title) is the variable name for the title of the movie, and it should be replaced with the actual title. I'm not sure why it's not doing that.
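For reference, the fallback Ron asks about can be pictured as an ordered list of preferences: try the Brazilian title first, then the USA or English title, and keep the original title only if nothing matches. The Python below is a sketch of that logic under stated assumptions, not a tested change to the scraper XML; `pick_title`, the `(title, note)` pair format, the 'Brazil' marker and the sample data are all hypothetical.

```python
import re


def pick_title(original_title, akas, marker_patterns):
    """Return the first alternate title matching the earliest pattern.

    akas: list of (title, note) pairs (hypothetical input format).
    marker_patterns: regex strings in order of preference.
    """
    for pattern in marker_patterns:
        for title, note in akas:
            if re.search(pattern, note):
                return title
    # Nothing matched: fall back to the original so the title is never empty.
    return original_title


# Preferred markers, most wanted first.
preferences = [r"Brazil", r"(USA)|(English title)"]

# Made-up sample data with no Brazilian entry.
akas = [("Titre Original", "France"), ("The English Title", "USA")]
print(pick_title("Titre Original", akas, preferences))  # The English Title
```

Checking the patterns in the outer loop means a lower-priority match is only used when no entry at all matches a higher-priority pattern.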
     
