Movies - enhanced details + cover retrieval + auto matching

gamejester

Retired Team Member
  • Premium Supporter
  • May 13, 2007
    418
    37
    Home Country
    United Kingdom
    Automatching can only do so much. It works by comparing each of the movie names returned from the search of the details site against the file name and computing the Levenshtein distance, giving a number between 0 and 100 for each; at the end of it the lowest-scored result wins. This becomes less accurate the more unrelated words you have in what it is expecting to be the title.
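
    A minimal sketch of that kind of scoring, for anyone who wants to experiment outside MP. This is not the plugin's code: the function names, the lower-casing and the way the distance is scaled to 0-100 are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def score(file_title: str, candidate: str) -> int:
    """Scale the distance to 0..100 (assumed scaling); 0 means identical."""
    a, b = file_title.lower(), candidate.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 0
    return round(100 * levenshtein(a, b) / longest)


def best_match(file_title: str, candidates: list[str]) -> str:
    """Pick the candidate with the lowest score, as described in the post."""
    return min(candidates, key=lambda c: score(file_title, c))
```

    The cleaner the title extracted from the file name, the more likely the exact title ends up with the lowest score.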

    For any that it gets wrong, you can flick through them within MP, press F9 and do an IMDB refresh; this will give you a list of possible matches and you can choose the one you want. If you un-tick 'get actors' in the config app then it is really quick to retrieve the movie info.

    You could always knock up a script to delete unwanted files for you, say delete any file in the dir under 350 MB; you could also get it to rename the file to the name of the dir.
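
    A rough sketch of such a script, assuming one movie per folder. The function names and the dry-run default are my own choices, not anything that ships with MP. Note that deleting everything under the limit would also remove cover art and subtitles, so check the printed plan before turning the dry run off.

```python
from pathlib import Path

SIZE_LIMIT = 350 * 1024 * 1024  # the "350 MB" threshold from the post, in bytes


def plan_cleanup(movie_dir: Path, size_limit: int = SIZE_LIMIT):
    """Return (files_to_delete, rename_pair) for one folder without touching disk."""
    files = [p for p in movie_dir.iterdir() if p.is_file()]
    small = sorted(p for p in files if p.stat().st_size < size_limit)
    large = [p for p in files if p.stat().st_size >= size_limit]
    rename = None
    if len(large) == 1:  # only rename when the main movie file is unambiguous
        src = large[0]
        dst = src.with_name(movie_dir.name + src.suffix)
        if dst != src:
            rename = (src, dst)
    return small, rename


def apply_cleanup(movie_dir: Path, size_limit: int = SIZE_LIMIT, dry_run: bool = True):
    """Print the plan; only delete and rename when dry_run is False."""
    small, rename = plan_cleanup(movie_dir, size_limit)
    for path in small:
        print("delete", path)
        if not dry_run:
            path.unlink()
    if rename:
        print("rename", rename[0], "->", rename[1])
        if not dry_run:
            rename[0].rename(rename[1])
```

    Calling apply_cleanup(Path("some movie folder")) with the default dry_run=True only prints what would happen.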
     

    powerfix

    Portal Member
    December 7, 2006
    8
    0
    Well, you would not want your movie displayed in the GUI as "Eagles, The"; that would look a bit rubbish, so best to keep the title in the db correct. Also this would make all the built-in functions break, as the title is used as a key field when doing lookups for covers and IMDB info, etc.

    What could be done (maybe) is a change to the way the Title db retrieval is ordered, so all movies that start with "The" are in alphabetical order starting from the second word... hmm, I've not looked at this bit of the code, but I will add it to the feature requests in the first post.

    Funnily enough, I've got them stored as "Shawshank Redemption, The" in the database, just so I can get the list to sort by the actual title instead of "The..." first. Otherwise, I've got literally 100 movies that all begin with "The" and they are all a sub-sort of the original list. That is why I changed it.

    As you've suggested, I think the list ignoring "The" if it is the first word would actually be the better solution, or at least the option to. It is a much better idea and solves my problem! :D
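
    Just to illustrate the idea (a sketch only, not how the plugin's db code works, and sort_key is a made-up name): the stored title stays "The ..." and only the key used for ordering skips the article.

```python
def sort_key(title: str) -> str:
    """Ordering key that skips a leading 'The ' but leaves the title itself alone."""
    lowered = title.casefold()
    if lowered.startswith("the "):
        return lowered[4:]
    return lowered


titles = ["The Shawshank Redemption", "Commando", "The Eagles", "300"]
ordered = sorted(titles, key=sort_key)
print(ordered)
```

    The titles are still displayed as "The Eagles" and "The Shawshank Redemption"; they just sort under E and S.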

    You're a champ.

    Cheers,

    pwrfix.
     

    seco

    Retired Team Member
  • Premium Supporter
  • August 7, 2007
    1,575
    1,239
    Home Country
    Finland
    Automatching can only do so much. It works by comparing each of the movie names returned from the search of the details site against the file name and computing the Levenshtein distance, giving a number between 0 and 100 for each; at the end of it the lowest-scored result wins. This becomes less accurate the more unrelated words you have in what it is expecting to be the title.

    For any that it gets wrong, you can flick through them within MP, press F9 and do an IMDB refresh; this will give you a list of possible matches and you can choose the one you want. If you un-tick 'get actors' in the config app then it is really quick to retrieve the movie info.

    You could always knock up a script to delete unwanted files for you, say delete any file in the dir under 350 MB; you could also get it to rename the file to the name of the dir.

    Yes, I understand.

    But it's weird that in the cases where I get a wrong match, if I do a manual search I get a list of choices and the correct one is almost always first in the list. Isn't the first in the list the one with the lowest score (best match)?

    For example, "300", which I mentioned earlier, is first in the manual list but doesn't get selected automatically; instead the automatic scan selects the 6th or 7th from the list, which is something totally wrong.

    Also, in my cases I think the year of the movie isn't used for matching at all. Shouldn't it be?

    Thanks!
     

    gamejester

    Retired Team Member
  • Premium Supporter
  • May 13, 2007
    418
    37
    Home Country
    United Kingdom
    As we cannot rely on the first item returned in the list being the one you want, we compare what we think is the movie title, based on the file name, to each movie returned from the search. The reason yours is matching to other movies is because you have file names with additional strings.

    So it matches your file name, something like

    300 720p dimmension h.265 5.1 other stuff.mkv

    against the returned list

    300
    300 mil do niebi
    300 Premiership Goals
    etc....

    As far as the algorithm is concerned, one of the later ones will score as a closer match than 300, as they have more characters and length in common with your original string.

    There is nothing we can do about that, as the movie might actually be called 300 720p dimmension h.265 5.1 other stuff, in which case we are doing the right thing.
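
    You can see the effect with a plain Levenshtein distance on the example above. This is a simplified sketch: it uses the raw distance rather than whatever scaling and extra steps the plugin applies, and the variable names are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (insert, delete, substitute), row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]


file_title = "300 720p dimmension h.265 5.1 other stuff"
candidates = ["300", "300 mil do niebi", "300 Premiership Goals"]

distances = {c: levenshtein(file_title.lower(), c.lower()) for c in candidates}
winner = min(distances, key=distances.get)

for name, dist in distances.items():
    print(f"{dist:3d}  {name}")
print("lowest distance:", winner)
```

    To turn the file title into "300", every character after the first three has to be deleted. The longer titles can substitute some of those characters instead of deleting them, so they need fewer edits and end up with the lower score.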

    The only ways to fix this are:

    1. You fix up your file names by creating a script that renames the file to that of the folder - this is your quickest and cleanest solution
    2. We add a new feature that allows the user to list strings to strip from a file name - this will have to be maintained by each user and, as the scene changes, might be a pain to keep up to date.
    3. We fix up the 'use folder' option and treat each folder as a movie: take the name from the folder and auto-select any large movie file and cover art in that folder - this will work well, but we are under change freeze so it will not be in the core until 1.0.1
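
    Option 2 could look something like the sketch below. The feature does not exist yet, so the word list, the function name and the matching rules are all hypothetical.

```python
import re
from pathlib import Path

# Hypothetical user-maintained list of strings to strip from file names.
NOISE_WORDS = ["720p", "1080p", "dimmension", "h.265", "5.1", "other stuff"]


def clean_title(file_name: str, noise_words=NOISE_WORDS) -> str:
    """Drop the extension and any listed strings, then tidy up the spacing."""
    title = Path(file_name).stem
    for word in noise_words:
        # Match the string only when it is not glued to other letters or digits.
        pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
        title = re.sub(pattern, " ", title, flags=re.IGNORECASE)
    return " ".join(title.split())


print(clean_title("300 720p dimmension h.265 5.1 other stuff.mkv"))
```

    With that list the example file name is reduced to just the title before it is compared against the search results.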

    As for the year, it can only get the year if it is stored in brackets, either (1999) or [1999]; otherwise we have no idea if the digits are part of the movie title or the year, so we leave them alone.
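
    In other words, something along these lines (an illustrative sketch with made-up names, not the plugin's actual parsing code): four digits count as a year only when they are wrapped in round or square brackets.

```python
import re

# Four digits count as a year only inside (...) or [...].
YEAR_IN_BRACKETS = re.compile(r"\((\d{4})\)|\[(\d{4})\]")


def split_title_and_year(name: str):
    """Return (title, year); year is None when no bracketed year is present."""
    match = YEAR_IN_BRACKETS.search(name)
    if match is None:
        return " ".join(name.split()), None
    year = int(match.group(1) or match.group(2))
    title = name[:match.start()] + " " + name[match.end():]
    return " ".join(title.split()), year


print(split_title_and_year("Commando (1985)"))
print(split_title_and_year("2001 A Space Odyssey [1968]"))
print(split_title_and_year("300"))
```

    Bare digits such as "300" or the "2001" in a title are left in the title untouched.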
     

    Spaldo

    MP Donator
  • Premium Supporter
  • May 7, 2008
    495
    12
    FlashFXP Development Team
    Home Country
    Well, you would not want your movie displayed in the GUI as "Eagles, The"; that would look a bit rubbish, so best to keep the title in the db correct. Also this would make all the built-in functions break, as the title is used as a key field when doing lookups for covers and IMDB info, etc.

    What could be done (maybe) is a change to the way the Title db retrieval is ordered, so all movies that start with "The" are in alphabetical order starting from the second word... hmm, I've not looked at this bit of the code, but I will add it to the feature requests in the first post.

    I concur, please don't change the way the title is stored... We don't want "Eagles, The". Besides the fact that they are actually called "The Eagles", it looks silly and makes the scanner go all weird.

    FYI, I am going through some of the concerts that are not stored in MovieXML and adding them in along with the song list and cover / art work :)

    Which grabber are you using?

    I will enable that setting and step through the code to see what it is supposed to do. I am assuming it was designed to take the folder name as the movie title, but I have never looked at it so I don't know yet.

    Another thing that could be added as a new feature would be a user-definable area where you add strings that you want removed from a file name when extracting the title, like the MPTVSeries plugin does.

    Seco,

    I went through a bunch of my files and named them the way they are listed on the MovieXML site, and combined with gamejester's new version I am getting great results.

    There are a few here & there that are coming up wrong, e.g. Spider-Man 3 won't be found at all, and 2 others were matched way wrong (the names escape me), but I am getting much, much better results than on my first attempts.

    Thanks gamejester for the efforts so far. The fuzzy logic is performing better, but I will try to give you a couple of examples that may assist in getting that percentage match just a tad higher :)
     

    seco

    Retired Team Member
  • Premium Supporter
  • August 7, 2007
    1,575
    1,239
    Home Country
    Finland
    I'm not an expert, but

    As for the year it can only get the year if it is stored in brackets, either (1999) or [1999] otherwise we have no idea if the digits are part of the movie title or the year, so leave them alone.

    If we find a 4-digit number in the file name, why shouldn't we try to see if it matches the year in the movie information and, if it does, give a higher priority to that match?

    As far as the algorithm is concerned, one of the later ones will score as a closer match than 300, as they have more characters and length in common with your original string.

    I think we shouldn't give length matching that much power.

    I'm not going to rename my files because 1) it's too much work and 2) I want to keep the information that I have in the file names.

    So the folder name would be the best way for me. For example, the TVSeries plugin can scan and parse scene release file names very nicely.
     

    gamejester

    Retired Team Member
  • Premium Supporter
  • May 13, 2007
    418
    37
    Home Country
    United Kingdom
    It does more than that; I was just dumbing it down into an example everyone will understand. If you want more info, look at this:

    Levenshtein distance - Wikipedia, the free encyclopedia

    The year guess is not possible as you have described it: the sections of code which deal with the different bits of the process are re-used by several different functions, so it would require a re-write or a lot of additional overloaded functions. At the point when it finds the year it has to KNOW it is the year and not GUESS; it then passes this out to external csscripts as a known constraint in order to locate matching movies on the info sites.

    As previously stated, the 2 ways we can address this are 1. do like mytvseries does and let the user specify a list of keywords to strip off the name, or 2. use folders as names.

    Added them to the feature request section.
     

    Spaldo

    MP Donator
  • Premium Supporter
  • May 7, 2008
    495
    12
    FlashFXP Development Team
    Home Country
    Seco... I was recently getting conflicts with about 40 movies. I added the year to the file name, e.g. Commando (1985), and now I am getting correct matches on these titles...

    What I am getting at is, perhaps do the scan, see which ones come up as a conflict or incorrect, and then change just those file names. It may be a reasonable job at first, but you only have to do it once.
     

    Paranoid Delusion

    Moderation Manager
  • Premium Supporter
  • June 13, 2005
    13,062
    2,978
    Cheshire
    Home Country
    United Kingdom
    Seco... I was recently getting conflicts with about 40 movies. I added the year to the file name, e.g. Commando (1985), and now I am getting correct matches on these titles...

    I did this a year ago and it didn't half improve scanning results. It was a bit of a pain to do, but I worked on the principle that I only had to do it once, and now every time I rip a DVD, I cut'n'paste the name\year from IMDB.

    Worth every minute spent doing this, so can highly recommend.
     
