IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.7

Should this be the default imdb scraper?

  • Yes, I do not want to re-import

    Votes: 19 95.0%
  • No, keep this one separate

    Votes: 0 0.0%
  • Who cares, I got movies to watch

    Votes: 1 5.0%

  • Total voters
    20
  • Poll closed.

RoChess

Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    Re: AW: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3

    Hi RoChess,

    I like this scraper; it works a lot better than the standard IMDb scraper. I tested it with 58 movies and it got all of them correct without any assistance or corrections (whereas the original scraper got nine movies wrong).

    However, as I am German, I'd like to use OFDb for titles and summaries. I wrote a small program to update the rename XML file and added attribute fields for the OFDb ID and German titles, then copied and pasted the renaming part from the IMDb+ scraper into the MoPi OFDb scraper. This, however, seems to greatly confuse the OFDb scraper, as it stops working completely :(

    Could you give some hints and tips as to what needs to be adjusted to get the ofdb scraper working with the renamer?

    Thank you, I've put a lot of work into it so far, but the results on my own HTPC are worth it.

    As far as integrating my code into a different system goes, if the 'other' scraper also uses IMDb tt-IDs as a reference, then it's almost as easy as copy and paste. But if you changed it from IMDb tt-IDs to OFDb IDs, then more changes are needed, and I guess that is where things went wrong. What I did with the IMDb+ scraper goes way outside the scope of scrapers, so a lot of additional conditions have to be met. For example, the rename database file has to be in the location defined by the scraper script, with the same filename; I used file="C:\Rename dBase IMDb+ Scraper.xml" for that. The file also has to be valid XML, so perhaps your rename program broke that. Use the XML syntax checker inside the comments section to verify: copy and paste the content of the XML file into the online form and check.
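    If you would rather check the file locally than paste it into the online form, a minimal sketch in Python (not part of the scraper itself) could look like this. The root element name `renames` and the helper name `check_well_formed` are assumptions made for illustration; only the `<rename id="..." title="..." />` entry format comes from the actual rename file.

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None

# Assumed layout: a single root element (hypothetically named "renames")
# wrapping the <rename .../> entries.
good = '<renames><rename id="tt1233571" title="The Art Of War II: Betrayal" /></renames>'
# Same entry, but the <rename> element is never closed.
bad = '<renames><rename id="tt1233571" title="The Art Of War II: Betrayal"></renames>'

print(check_well_formed(good))
print(check_well_formed(bad))
```

    This only checks that the XML is well-formed; it does not tell you whether the scraper will accept your extra attributes.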

    The rest is then pretty logical: I load the contents of the XML file into an array called rename_array. Then, via the @nodes, I compare the @id (which I use to store the IMDb tt-ID inside the XML file) to movie.imdb_id, and on a match I rename the title and/or sortby field via their respective @nodes. To allow a change to the title only, the sortby only, or both, I added the extra empty-string check.
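    To make that logic easier to follow outside the scraper script, here is a rough Python sketch of the same idea. It is not the scraper code: the root element `renames`, the `sortby` attribute name, the sample tt-IDs, and the `movie` dictionary are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rename file content; the IDs and titles are made up.
RENAME_XML = """<renames>
  <rename id="tt0000001" title="Example Saga 1: The Start" sortby="Example Saga 1" />
  <rename id="tt0000002" title="" sortby="Example Saga 2" />
</renames>"""

def load_rename_array(xml_text):
    """Load every <rename> node into a list of attribute dictionaries."""
    root = ET.fromstring(xml_text)
    return [dict(node.attrib) for node in root.iter("rename")]

def apply_rename(movie, rename_array):
    """Overwrite title and/or sortby when the entry id matches movie['imdb_id']."""
    for entry in rename_array:
        if entry.get("id") != movie["imdb_id"]:
            continue
        # The empty-string checks allow an entry to change only one field.
        if entry.get("title", "") != "":
            movie["title"] = entry["title"]
        if entry.get("sortby", "") != "":
            movie["sortby"] = entry["sortby"]
        break
    return movie

rename_array = load_rename_array(RENAME_XML)
movie = {"imdb_id": "tt0000002", "title": "Original Title", "sortby": "Original Title"}
print(apply_rename(movie, rename_array))
```

    In the second sample entry the title is empty, so only the sortby value changes and the scraped title is kept.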
     

    maximm

    Portal Member
    July 24, 2011
    21
    3
    Re: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.6

    Hi, I tried the scraper, but when getting IMDb scores I still get a lot of zeros, and most scores are incorrect. Am I doing something wrong?

    My bad, I checked the script and I'm getting the Rotten Tomatoes scores.
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    Re: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.6

    I was still working on some major improvements to the scraper, so when I caught the rating issue on a few movies I figured I could just combine it all into a single release. But testing the new features is taking longer than expected, and since the issue affects more movies than I thought it did, I've released v3.1.7 to solve that for the time being.

    With the new version that I'm working on, you will be able to adjust the global_options on the fly via an XML file, so that re-importing of the script in scraper-debug mode is no longer needed. It will also support conditional updates, meaning that on refresh of movie data you can configure the scraper to only update missing info and force update the scores+votes. This way a refresh will not only go faster, but will retain any custom changes you made to title, summary, etc.
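    As a rough illustration of what conditional updates mean here, the following Python sketch keeps existing values, fills in missing ones, and always overwrites the scores and votes. This is not the scraper's implementation; the function name `refresh_movie` and the field names are hypothetical.

```python
def refresh_movie(stored, scraped, force_fields=("score", "votes")):
    """Merge freshly scraped data into the stored movie record.

    Fields listed in force_fields are always overwritten; any other field
    is only filled in when the stored value is missing or empty.
    """
    merged = dict(stored)
    for field, value in scraped.items():
        if field in force_fields or not merged.get(field):
            merged[field] = value
    return merged

stored = {"title": "My Custom Title", "summary": "", "score": 0.0, "votes": 10}
scraped = {"title": "Official Title", "summary": "A plot.", "score": 7.4, "votes": 1200}
print(refresh_movie(stored, scraped))
```

    The custom title survives the refresh, the empty summary is filled in, and the score and votes are replaced.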
     

    Merlyn

    Portal Pro
    July 8, 2011
    250
    322
    Home Country
    Germany
    AW: IMDb+ Scraper (Force English title, Auto-Rename titles to group, and more) v3.1.7

    Thanks for your reply, RoChess :)

    I haven't found what was causing the issues at first, but after a good night's sleep and a restart of MP config it was working fine...

    However, since my first attempts relied a lot on the rename XML file (I had added the OFDb ID and the summary from OFDb as fields to the file), I made some more changes.

    I successfully managed to translate the IMDb ID to the OFDb ID and read the movie title and summary from OFDb. And while writing this on one monitor with the code on the other, I noticed that after the correct title is set from OFDb, it is overwritten again with the title from the rename XML in the rename section... need to fix this... It works at the moment because I changed all the titles in the rename XML to the correct German titles...

    Anyway, the point is that I can now get the German titles and summaries if they are available on ofdb.org.

    The next thing that bothers me, and that I plan to change, is that a valid rename XML is required for correct movie sorting. I noticed that IMDb already has the info needed to sort movies correctly on the movieconnections subpage. So I'll try to get correct sorting from there and make the rename DB optional, so that if a movie is not in the rename XML (it only has some 700+ movies right now), the info is parsed from IMDb directly...

    Need to learn and understand regex first, though...

    Note on attachments: These are rather dirty hacks! Use at your own risk! That they work quite well for me does not mean they will not mess up your database! Make a backup before using! I changed the IMDb+ scraper version and date to 3.1.8 on 07/24/2011... Also, not for public use; only for evaluation by RoChess.
     

    Attachments

    • IMDb+ v3.1.8.xml
      37 KB
    • renamedb de.xml
      87.5 KB

    Furetto

    Moderator - Dutch Forums
    April 11, 2005
    664
    61
    50
    Brussels
    Home Country
    Belgium
    Found a small error, the IMDB id for Art of War II is wrong:
    <rename id="tt0123357" title="The Art Of War II: Betrayal" />
    Should be
    <rename id="tt1233571" title="The Art Of War II: Betrayal" />
    Otherwise you get "The people's court"
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897

    Thank you, hate it when typos sneak in.

    It will be part of v0.9 of the rename XML, which I'll release when v3.2.x is done. The users that are freaking out now can fix it themselves :D

    One feature of v3.2.x is going to be the ability to have a custom rename XML file, so that any update to the default rename file can be added without losing any of your custom edits. And a ton more exciting new things, so stay tuned :cool:
     

    DMember 49125

    Guest
    Hi RoChess,

    Is it possible to include an option that does not remove "The" from the beginning in the sortby value? (Exactly as the 3.0.7 version worked)
    :D
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897

    As it stands, the scraper relies on the MovingPictures settings, so disable the article removal setting in the advanced settings and it should work. If not, let me know and I'll retest that scenario in more detail. I'm currently improving the regular expressions to make things run faster, and my screen is full of windows for testing that, but it should work.

    Or are you saying you want the title to remove the article prefix and not remove it on the sortby?
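    To make the difference concrete, here is a small Python sketch of article removal on the sortby value with an on/off switch. It is only an illustration of the behaviour being discussed, not how MovingPictures or the scraper does it; the function name `make_sortby`, the `remove_article` flag, and the list of articles are assumptions.

```python
import re

# Leading English articles followed by whitespace (assumed list).
ARTICLE_RE = re.compile(r"^(the|an|a)\s+", re.IGNORECASE)

def make_sortby(title, remove_article=True):
    """Build a sortby value, optionally stripping one leading article."""
    if not remove_article:
        return title
    return ARTICLE_RE.sub("", title, count=1)

print(make_sortby("The Art Of War II: Betrayal"))
print(make_sortby("The Art Of War II: Betrayal", remove_article=False))
```

    With the flag off, the sortby value keeps "The" at the beginning, which is the behaviour asked about; the title itself is never touched by this function.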
     
