Rename database typos and issues

ejvdh

MP Donator
  • Premium Supporter
  • February 26, 2010
    32
    0
    Home Country
    Installed scraper 3.3.0 and am testing it now.
    A couple of small comments on the custom movie name list in the 'rename database' file:
    - English title of HK movie Mou gaan Dou is 'Infernal Affairs' not 'Internal Affairs'
    - With POTC: Curse of the Black Pearl something went wrong with the title. To be really nitpicking: it's 'At World's End', instead of 'At Worlds End'.
    - The series 'The Girl...', 'Bourne' and 'Ocean' look a bit "weird" when renamed as per the list. I would suggest using 'Millennium' or 'Millennium Series' for 'The Girl with the Dragon Tattoo' and its sequels, and keeping the original titles for Bourne and Ocean, with only the "sortby" adjusted. That is what I've done in the custom file, but it's a matter of personal preference, of course.
    - The subtitles (as in: extra titles) for The Karate Kid remake (Kung Fu Kid) and Jurassic Park III (Return to the Island) were just working titles, so I would personally not use them as the official movie titles, but again, personal preference.
    - The subtitle of the first Chronicles of Narnia movie (The Lion, the Witch and the Wardrobe), however, was part of the original title, but it is missing from the list.
    - I missed the Pink Panther series with Peter Sellers
    - What I personally don't really like is the addition of 'I' to the first movie in a series. Wouldn't they be sorted first anyway when the 'I' is omitted? Otherwise the sortby can be adjusted.

    No criticism, just some personal comments. I've been looking for a scraper like this for a long time.
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.0

    - English title of HK movie Mou gaan Dou is 'Infernal Affairs' not 'Internal Affairs'

    Nice catch, fixed for next release

    - With POTC: Curse of the Black Pearl something went wrong with the title. To be really nitpicking: it's 'At World's End', instead of 'At Worlds End'.

    Nice catch, fixed as well. I could have sworn it did not have the apostrophe when I copied and pasted the title while adding the series.

    - The series 'The Girl...', 'Bourne' and 'Ocean' look a bit "weird" when renamed as per the list. I would suggest using 'Millennium' or 'Millennium Series' for 'The Girl with the Dragon Tattoo' and its sequels, and keeping the original titles for Bourne and Ocean, with only the "sortby" adjusted. That is what I've done in the custom file, but it's a matter of personal preference, of course.

    Yeah, I struggled with those, there are however a lot of skins that do not offer a lot of room for the titles, so if at all possible I try to go for the shortest title possible. This sometimes means the Roman numeral goes inside the title; to prevent duplications as well. To fall in line with the normal format of first movie title being the name of the series, it would end up like "The Bourne Identity III: The Bourne Ultimatum" and for me personally that just looked weird.

    I also considered "The Bourne Trilogy III: Ultimatum", but then I found out they are making a 4th one. The existing movies would then have to be refreshed to get the correct title as well, so I preferred to go with a scheme that avoids that as much as possible. The only exception is series that are not yet series. For example, right now I use "Avatar" with a sortby of "Avatar I", and once we get closer to the release of "Avatar II" I will change the first title to "Avatar I", but at least it will group correctly in 2012.

    But as you already found out, this is why you have that custom file, so any modifications you prefer can overrule the default one. This way I can keep auto-updating the default one in the future (working on that via plugin), and it will not touch your custom changes.

    - The subtitles (as in: extra titles) for The Karate Kid remake (Kung Fu Kid) and Jurassic Park III (Return to the Island) were just working titles, so I would personally not use them as the official movie titles, but again, personal preference.

    It looked very weird for the 4th movie to get a sub-title and the 5th one not to have one, so if the working title is not a crazy one I tend to prefer using it. But I looked again at the imdb.com movieconnections and noticed that a sequel is now planned for the remake/reboot. So I'm going to remove it as the 5th movie of the old series and add it as "The Karate Kid (Remake) I", with "II" for the upcoming sequel in 2013. They constantly make those adjustments at imdb.com as well, even for older series, so it is sometimes hard to stay up to date.

    On that note, it is always a tough choice between adding it to the old series or adding it as Remake or Reboot, but I'm always open for suggestions. If the original series is very old, then I actually consider renaming the old series to like (Original). Every day a new MovingPictures user is born and they might go "Sweep the leg, who?". Always makes me feel old, but I like classics and know references to movies made in black and white. Still the majority of MovingPictures users is not going to know they made "Straw Dogs" in 1971, so instead of adding the 2011 remake as "(Remake)", I went for renaming the 1971 version into "(Original)". This is once again personal preference indeed, but going by majority I'm thinking it is the best way to go.

    - The subtitle of the first Chronicles of Narnia movie (The Lion, the Witch and the Wardrobe), however, was part of the original title, but it is missing from the list.

    Nice catch again, added as well.

    - I missed the Pink Panther series with Peter Sellers

    I have all the Peter Sellers ones myself, but upon adding the series to the XML file, I noticed that the movieconnections at imdb.com shows entries done by other actors. To me Sellers is the true Clouseau, so in the end I decided to keep them in my custom file, and not include them in the default rename.

    What do you think?

    - What I personally don't really like is the addition of 'I' to the first movie in a series. Wouldn't they be sorted first anyway when the 'I' is omitted? Otherwise the sortby can be adjusted.

    Omitting the "I", the movies get sorted at the end of the others. And you have to understand the sortby adjustment option was added afterwards, so to make it sort correctly I 'had' to rename the titles that way. Then once the sortby adjustment was added it kind of stuck, so as not to confuse people with an existing collection who would add new series to their collection. At that point there were already some 700 movie entries in the database.

    It also made things more consistent with the other movie series that do have a sub-title for the 1st movie, such as "James Bond I: Dr. No" and, as you pointed out, the first Narnia. However, you could open the default rename XML file in Notepad and do CTRL+H -> ` I"` -> `"`, but you then have to manually undo that replacement in the sortby field for Avatar, Independence Day, and X-Men: First Class, as those are the three series whose sequels are so far in the future that I'm not renaming their titles yet.
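
One way to see why a first title without the "I" can end up after its sequels is a plain character-by-character sort, sketched here in Python. This is only an illustration: it assumes an ordinal string comparison (the actual sort used by the skin or plugin may differ), and the "II" and "III" titles are assumed to follow the same pattern as "James Bond I: Dr. No".

```python
# Plain ordinal string sort; an assumption, the real sorting may be different.
without_numeral = [
    "James Bond: Dr. No",                    # first movie, no "I"
    "James Bond II: From Russia with Love",  # assumed naming pattern
    "James Bond III: Goldfinger",            # assumed naming pattern
]
with_numeral = ["James Bond I: Dr. No"] + without_numeral[1:]

# A space sorts before a colon, so "James Bond II..." and "James Bond III..."
# come before "James Bond: ..." when the first title has no numeral.
sorted_without = sorted(without_numeral)

# With the "I" present, the colon after "I" sorts before the second "I" of "II".
sorted_with = sorted(with_numeral)

print(sorted_without)  # first movie ends up last
print(sorted_with)     # I, II, III in order
```

Roman numerals still do not sort correctly in general under this comparison (for example "IX" sorts before "V"), which is where a separate sortby value helps.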

    No criticism, just some personal comments. I've been looking for a scraper like this for a long time.

    I saw it all as positive criticism, and you caught some mistakes I made, so keep them coming.

    As for the other points: it will always be hard to create something universal, because everybody has their own preferences, but I'm trying to be as democratic as possible. In the end, however, it is me who ends up updating this database file, so I do tend to go with how I want it before putting it up for a vote :cool:
     

    ejvdh

    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.1

    Thanks, RoChess, excellent work.
    A couple of comments:
    - I noticed the few glitches in the rename file are still there, probably accidentally uploaded old file.
    - A few more series you might want to add: the Mariachi Trilogy, Tom Ripley movies and Jack Ryan movies come to mind. Or did you have specific reasons not to add those?
    - About the plugin: thinking about the English/foreign title logic, it's slightly confusing as the options are shown now and with the first option 'on' (always original title), there is no point having the second option also on (add original title). I noticed that you then don't get the foreign title twice, luckily! Right now when you want the original title, with English in brackets, you have to first select that you don't want the original title at all (1st option 'off'), then that you want the foreign title in brackets and then that you want this inverted. Slightly warped logic! I understand that this is caused by regularly adding and changing options (generally improving things, I have to say!), but in my view there are 4 options that you could want:
    1. Always only English title
    2. Always only foreign title
    3. English with foreign in brackets
    4. Foreign with English in brackets
    Perhaps you could change the first 3 choices in the plugin into a multiple choice with the above options. Theoretically you have more options with title in local language in several configurations, but I don't think that will be very popular.
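
As a rough sketch of the single multiple-choice setting suggested above, the four options could be modelled like this in Python. The names `TitleMode` and `format_title` are hypothetical and this is not the plugin's actual code; it only illustrates that one setting with four values covers the cases without contradictory combinations.

```python
from enum import Enum


class TitleMode(Enum):
    """The four title options from the list above (names are hypothetical)."""
    ENGLISH_ONLY = 1
    FOREIGN_ONLY = 2
    ENGLISH_WITH_FOREIGN = 3
    FOREIGN_WITH_ENGLISH = 4


def format_title(english: str, foreign: str, mode: TitleMode) -> str:
    """Build the display title for one movie according to the chosen mode."""
    if english == foreign:
        # Nothing to put in brackets when both titles are identical.
        return english
    if mode is TitleMode.ENGLISH_ONLY:
        return english
    if mode is TitleMode.FOREIGN_ONLY:
        return foreign
    if mode is TitleMode.ENGLISH_WITH_FOREIGN:
        return f"{english} ({foreign})"
    return f"{foreign} ({english})"


example = format_title("Infernal Affairs", "Mou gaan Dou",
                       TitleMode.FOREIGN_WITH_ENGLISH)
print(example)
```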

    Keep up the good work!
     

    RoChess

    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.1

    - I noticed the few glitches in the rename file are still there, probably accidentally uploaded old file.

    That was done on purpose to test the new auto-update system, but I forgot to commit the new rename database when I did scraper v3.3.1 :D

    I'm working on some skin features now, where you will see an icon appear in like top right corner to indicate something updated. Then you will be able to go to the left-sided hidden menu and use like "About" to see info about current IMDb+ scraper script and current rename XML database info. Still working out the details.

    - A few more series you might want to add: the Mariachi Trilogy, Tom Ripley movies and Jack Ryan movies come to mind. Or did you have specific reasons not to add those?

    I might have had reasons before (different actors, different canons), but frankly I like your suggestions, and I even added the original Tom Ripley ones. So they are all added now via revision 107 (I made a jump from v0.97 to v107 to stay in line with the Google Code revision system).

    This way I don't have to update the first post change list anymore and can just point people to: rename XML database r107. You can quickly see all the new changes/additions, and at the top right you can go back to the previous changes as well (r30 equals v0.97).

    - About the plugin: thinking about the English/foreign title logic, it's slightly confusing as the options are shown now and with the first option 'on' (always original title), there is no point having the second option also on (add original title).

    This is a side effect of the conversion from the previous layout with the buttons to this new listcontrol one. With the buttons layout I was able to disable the options that did not make sense when another option was enabled, and you would not be able to use them. On this new listcontrol that is not as easy to do anymore, but I might be able to use the IsHidden property to change the color of the labels to make them look disabled and use the TVTag label to disable them in code. Before I forget again I'll make myself an issue on the Google project website :D

    Keep up the good work!

    :D
     

    powermarcel10

    Retired Team Member
  • Premium Supporter
  • November 30, 2010
    2,839
    898
    35
    Groningen
    Home Country
    Netherlands
    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.7

    Hi man,

    thanks for this awesome plugin. The most important reason for me to use this plugin is the option that groups series of movies together. One thing I have to note: I thought you said (correct me if I'm wrong) that the whole process is automatic. I have to do a forced update of all movies first before I get the right result. Is that correct?
     

    RoChess

    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.7

    Hi man,

    thanks for this awesome plugin. The most important reason for me to use this plugin is the option that groups series of movies together. One thing I have to note: I thought you said (correct me if I'm wrong) that the whole process is automatic. I have to do a forced update of all movies first before I get the right result. Is that correct?

    It goes automatic for any new movie imported (the ones that have an entry of course) via the IMDb+ scraper-script.

    The rename+group option has to be enabled, of course. The scraper-script default is 'off' and the IMDb+ plugin default is 'on', because the XML file is needed and it is still possible to install/use the scraper-script without the plugin.

    So if you immediately begin importing movies during the initial install, a race condition is possible in which the IMDb+ plugin has not yet grabbed the latest rename database file and enabled the 'on' setting to use it.

    It also doesn't work for any existing movies imported (even if those are done via IMDb+ at a time there was no entry in the rename database).

    To solve the problem of other scrapers originally having imported the movie, you can use the "Force IMDb+..." option to switch all those movies over from for example TheMovieDb or default IMDb to start using IMDb+ scraper script. That means you then still have to update/refresh the movie details, which is what the hidden menu refresh options are for inside IMDb+ plugin.

    That part is still manual right now, as in you have to push the button to start the refresh process. And it is that part that will have an automated setting soon to like once a month/quarterly refresh/update your entire collection. Because summary/crew/genre/etc don't exactly change much there is the option "Refresh all fields" which by default is 'off', that will only update score + votes + certification (MPAA does sometimes adjust ratings) when you refresh a movie (first time import gets everything).
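
The effect of the "Refresh all fields" option described above can be sketched roughly as follows. This is an illustrative Python sketch only: the function name, the dictionary layout and the field keys are hypothetical, and only the behavior (score, votes and certification are always refreshed, everything else only when the option is 'on') is taken from the description.

```python
from typing import Any, Dict

# Fields that are refreshed even when "Refresh all fields" is off.
# The key names are hypothetical.
VOLATILE_FIELDS = ("score", "votes", "certification")


def refresh_movie(stored: Dict[str, Any], scraped: Dict[str, Any],
                  refresh_all_fields: bool = False) -> Dict[str, Any]:
    """Return an updated copy of the stored movie details."""
    updated = dict(stored)
    if refresh_all_fields:
        # Option 'on': take every freshly scraped field.
        updated.update(scraped)
    else:
        # Option 'off' (the default): only the fields that change over time.
        for field in VOLATILE_FIELDS:
            if field in scraped:
                updated[field] = scraped[field]
    return updated


stored = {"summary": "old text", "score": 7.0, "votes": 100, "certification": "PG"}
scraped = {"summary": "new text", "score": 7.4, "votes": 250, "certification": "PG-13"}

partial = refresh_movie(stored, scraped)
full = refresh_movie(stored, scraped, refresh_all_fields=True)
```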

    The rename+group database is maintained manually, I try to stay ahead of the game, such as for example the new James Bond movie finally got the subtitle "Skyfall", so I already put that into the rename database which I'll release soon when I finish those 350 titles. Normally that means I've added entries well before anybody would actually end up adding that movie to their collection so it would rename auto, but I also sometimes add older series or correct existing ones (typos/etc). You can use the Google Code project website DIFF function to see all the changes that happened, but an easier way is to on occasion use the "Refresh" -> "Rename Database Only" method that will 'update' only the movies in your collection that match up to an entry in the rename database.

    Currently I'm up to 1187 rename entries, which is quite a bit up from the 903 entries that exist in the current v1.0.6 version that everybody has. But I want to finish all those 350 movie titles first, so that everybody only has to run the refresh method once to get the mass-update reflected in their collection (Hopefully everybody likes the naming-schemes I used, though odds are that most entries go unused, because I added quite a few rare/old movies that happen to share title with modern releases). I expect once I'm done the rename database will be close to 1500 entries.
     

    vpupkin

    Portal Pro
    March 26, 2011
    84
    8
    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.7

    The rename+group database is maintained manually, I try to stay ahead of the game, such as for example the new James Bond movie finally got the subtitle "Skyfall", so I already put that into the rename database which I'll release soon when I finish those 350 titles.

    Hi RoChess, don't know if this is already a part of your upcoming update, but there is a whole bunch of Tom & Jerry movies (The Fast and the Furry; Meet Sherlock Holmes etc) that I don't see in the DB. If Futurama is on the list, I think it's fair game.

    What is your policy on non-english (mainly, Asian) groups / sequels? There are tons of these, and I am not sure you want to add it to the main list (considering preferred naming and all), yet I see e.g. Police Story...
     

    RoChess

    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.7

    What is your policy on non-english (mainly, Asian) groups / sequels? There are tons of these, and I am not sure you want to add it to the main list (considering preferred naming and all), yet I see e.g. Police Story...

    In the beginning, the larger the rename list grew, the longer it took to process, so I tried to keep things as small as possible and cover only the major series. But with the adjustment in matching method there is no delay anymore, no matter how large the list grows (well, to some extent of course). While I'm still busy going through those 350 titles that have duplicates at imdb.com, help me out some: just provide me the series you want with the rename XML file syntax already done (as if you were adding them to the custom rename database), so I can just copy and paste the results.

    PS: For Asian/Foreign movies, provide them in "English title (Foreign title)" format as the existing ones are done as well.
     

    Johan

    Portal Pro
    April 19, 2006
    443
    11
    48
    Home Country
    Sweden
    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.7

    Rochess! You are frickin crazy! :)
    I was just browsing the rename dbase and wow! What a pain to fill in all this!

    And on top of this you are doing the scraper script updates as well as the plug-in to setup and maintain the lists.

    I love you man! :)


    A little question: what will happen if I add a movie to the custom rename db that is already present in the original rename dbase?
    Will the custom db override original?

    I am just thinking about making my own "custom" for the swedish titles that I have.

    I guess I should then put "yes" on update all fields and "yes" on rename.

    If I do that I might get swedish titles (when available) for non-sequels and "hopefully" swedish names for the sequels that I add in the custom rename db?
     

    RoChess

    Re: IMDb+ Scraper - Movie Scraping on Steroids v3.3.7

    Rochess! You are frickin crazy! :)
    I was just browsing the rename dbase and wow! What a pain to fill in all this!

    Perfect example of a small project that got out of hand (most of it is just copy and paste though) :D

    And on top of this you are doing the scraper script updates as well as the plug-in to setup and maintain the lists.

    Well, I started small and whenever I feel like it I add a few more entries. A few IMDb+ users have submitted entries that I was able to just copy and paste. This list of 350 titles, however, is the largest nightmare so far. I have to load the search result at imdb.com, then verify each multiple entry to see if it qualifies, then load its movie AKA and connections pages and decide on a proper rename title (whether it's a series, reboot, remake, different story, etc). So far I'm down to roughly 175 titles, which have led to almost 300 extra entries in the rename database at the moment.

    Will the custom db override original?

    Yes, it goes by:

    1. IMDb+ title result from imdb.com main page (tries to detect English, and depends on other scraper-script options, such as country+language filters)
    2. English title correction if needed (depends on scraper-script options as well and is done via the AKA page)
    3. default rename dbase override (this is now always done, but tt-ID still has to match)
    4. custom rename dbase override (this is now always done, but tt-ID still has to match)

    The script will literally overwrite the title each time with the custom rename being the last one processed. It's a little messy, but this is because the XML scraper-script system is not really designed for what I'm pushing it to do :cool:
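
The "last one processed wins" order described above can be sketched like this. It is a simplified Python illustration, not the scraper-script itself (which is XML based): the function name, the dictionaries keyed by tt-ID and the example ID and titles are all hypothetical placeholders.

```python
from typing import Dict, Optional


def resolve_title(imdb_id: str,
                  main_page_title: str,
                  aka_english_title: Optional[str],
                  default_db: Dict[str, str],
                  custom_db: Dict[str, str]) -> str:
    """Apply the four steps in order; each later step overwrites the title."""
    title = main_page_title               # 1. title from the main page
    if aka_english_title:                 # 2. English correction, if any
        title = aka_english_title
    if imdb_id in default_db:             # 3. default rename database (tt-ID match)
        title = default_db[imdb_id]
    if imdb_id in custom_db:              # 4. custom rename database (tt-ID match)
        title = custom_db[imdb_id]
    return title


# Placeholder data, not real database entries.
default_db = {"tt0000001": "Default Series III: Sequel"}
custom_db = {"tt0000001": "My Preferred Title"}

with_custom = resolve_title("tt0000001", "Original Title", None, default_db, custom_db)
default_only = resolve_title("tt0000001", "Original Title", None, default_db, {})
no_match = resolve_title("tt0000002", "Original Title", "English Title", default_db, custom_db)
```

Because the custom lookup runs last, an entry in the custom file always takes precedence over the default one for the same tt-ID.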
     
