Spanish Scraper FilmAffinity.com with IMDb.es bonus to get fanarts -- v2.1.0

RoChess

Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #41
    Analyzing your explanation, I understand that the 5.1 is causing an issue because it gets identified as "maybe a year".

    Am I right? Is there any way of keeping the digits only if they come in the form of 4 consecutive digits?

    Just add more RegExp noise filters.

    \s?\[Spanish.+?\]|\s?\[\D+\]|......

    The "\s?\[Spanish.+?\]" part is more rigorous.

    \s = whitespace character (such as a space)
    ? = the previous element is optional
    \[ = look for a literal '[' character; the \ escape is needed because [ otherwise starts a character class in a regular expression
    Spanish = look for 'Spanish'
    . = match any character
    +? = repeat the previous element one or more times, but as few times as possible (lazy), stopping at the first match of what follows
    \] = look for a literal ']' character

    So it will grab " [Spanish.......]" and "[Spanish.......]" not caring at all what follows Spanish. But since the [ and ] chars have to be around it, it will not harm a movie title such as "The Spanish Prisoner".
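
    To see this filter in action, here is a minimal Python sketch. It only covers the two alternatives written out above (the elided "......" part is left out), and it assumes Python's re module treats these constructs the same way as the regex engine the noise filter actually runs on. The strip_noise helper and the sample names are made up for illustration.

```python
import re

# Only the two alternatives spelled out in the post; the rest of the
# filter is elided there and is not reproduced here.
NOISE_FILTER = re.compile(r"\s?\[Spanish.+?\]|\s?\[\D+\]")


def strip_noise(name: str) -> str:
    """Remove every substring matching the noise filter (hypothetical helper)."""
    return NOISE_FILTER.sub("", name)


if __name__ == "__main__":
    samples = [
        "Avatar [Spanish DVDRip 5.1]",  # bracketed tag starting with "Spanish"
        "Alien [DVDRip]",               # bracketed tag without any digits
        "The Spanish Prisoner",         # no brackets, so it must stay intact
    ]
    for sample in samples:
        print(repr(sample), "->", repr(strip_noise(sample)))
```

    The third title is left alone because neither alternative can match without a literal [ and ] around the text.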
     

    Friks

    Portal Member
    September 16, 2009
    16
    1
    Home Country
    Spain
    I'm testing this scraper (great work RoChess!!!) but it always downloads the summary in English. I don't understand this behaviour, because the "Primary Source" field is filled with "FilmAffinity.com (with IMDb.es)". Do you know what the problem is?

    That sounds like FilmAffinity has changed their website, so the scraper fails to get the summary in Spanish. Missing fields are then filled in by the IMDb English scraper.

    I would need a log with scraper debugging enabled (check the FAQ on how to enable that) that shows me exactly what goes wrong at what point. Try to reproduce the problem in the fewest steps possible, so that the log file is easy to read.

    Here is the log. Thank you very much!!!!
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #43
    Your problem seems to be Noise Filter related as well.

    You need to adjust the default noise filter to remove all the parts of the filename that are NOT related to the title, year, or part number.

    "American Gangster Spanish HDDVDRip By FreAk TEAm.avi"

    Does not get cleaned up enough by the default noise filter, causing the scraper serious problems trying to find a match.

    You would need something like:

    \sSpanish|\sHDDVDRip|\sBy\sFreAk\sTEAm|.....

    But then you run a giant risk of mangling a valid movie title such as "The Spanish Prisoner". What you really should do is adopt an NFO system, or rename all your files to the more common "Movie Title (Year).extension" format.
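
    The risk is easy to reproduce. The following Python sketch applies the broad filter (again without the elided "....." part) to the problem filename and to a legitimate title, and then shows how the "Movie Title (Year).extension" format can be parsed without any noise filter at all. The TITLE_YEAR pattern, the helper names, and the year in the sample filename are illustrative assumptions, not part of the scraper.

```python
import re

# The broad filter from the post, without the elided "....." part.
BROAD_FILTER = re.compile(r"\sSpanish|\sHDDVDRip|\sBy\sFreAk\sTEAm")

# Hypothetical pattern for the "Movie Title (Year).extension" naming scheme.
TITLE_YEAR = re.compile(r"^(?P<title>.+?)\s*\((?P<year>\d{4})\)\.(?P<ext>\w+)$")


def apply_broad_filter(filename: str) -> str:
    """Strip everything the broad filter matches."""
    return BROAD_FILTER.sub("", filename)


def parse_title_year(filename: str):
    """Return (title, year) for a cleanly named file, or None otherwise."""
    match = TITLE_YEAR.match(filename)
    if match is None:
        return None
    return match.group("title"), int(match.group("year"))


if __name__ == "__main__":
    # The release tags are removed as intended...
    print(apply_broad_filter("American Gangster Spanish HDDVDRip By FreAk TEAm.avi"))
    # ...but a real title loses a word as well.
    print(apply_broad_filter("The Spanish Prisoner.avi"))
    # A cleanly named file needs no filtering.
    print(parse_title_year("The Spanish Prisoner (1997).avi"))
```

    The second call prints "The Prisoner.avi", which is exactly the kind of damage described above.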

    I do not know how to build Artificial Intelligence into the scraper (you are welcome to assist me), so your alternative is to keep adjusting the NoiseFilter to become compatible with all the crazy filenames you have.

    Note: There might still be a problem in the scraper, but you need to enable the scraper debug log option for me to check into that (the little green bug icon, it is in the FAQ), and limit the log files to a *single* new movie being scraped. Either adjust the advanced settings to run only *1* scraper session at a time, or simply wait until all your existing movies are imported (meaning the import section stays empty when you launch the MovingPictures configuration screen, or use the log file to check that), and then re-import or add just *ONE* movie. The problem is that by default MovPic will scrape 5 movies at the same time, which makes it incredibly hard to read your log files. It's like 5 movies being played at the same time, but you only get one TV to watch them on.
     

    Gixxer

    Retired Team Member
  • Premium Supporter
  • August 18, 2007
    1,383
    41
    40
    Spain
    Home Country
    Spain
    Great, thanks!!

    With the new part it cleans the titles much better.

    However, only 2 movies out of 40 got auto-approved.

    Here is the log.

    I have created a new database, altered the noise filter to include your mod, and changed the accepted year difference from 1 to 3 years. I also set the auto-approve to reckless.

    Is this a problem with the scraper ???:D
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    • Thread starter
    • Moderator
    • #45
    Could be, but you need to provide the right log files for me to find out.

    I need scraper log files then. That little green bug needs to be on, as shown in the FAQ. And you need to limit it to one movie. The movingpictures.log file for a single movie already comes to around 5 MB, so imagine what 100 movies do.

    Otherwise I'll make a new FAQ entry on how to properly create log files for scraper problems. Adjusting the year difference should not be needed, but if the right movie shows up in the drop-down box inside the Configuration screen, it means another scraper picked the right movie, and only the first scraper gets auto-approval permission (unless you adjust the advanced settings).
     

    franky52

    Portal Member
    December 15, 2008
    12
    0
    Home Country
    Spain
    Hi RoChess!!
    Thanks for your scraper. It helps the whole Spanish community. :D:D:D

    But FilmAffinity has changed something on their web page. There are some fields that can't be retrieved, such as genre or synopsis. And the images are not in the same location, or at least I couldn't reach them with the web page info contained in the scraper.

    I've changed the code a little bit so the synopsis can be retrieved. But it's been impossible with the rest, as I don't quite understand how the scraper works. I wouldn't mind updating this scraper as long as I could understand it a little bit more.

    Would it be possible to get some kind of guide....????

    If you need some logs, let me know and I'll attach a completely new log with just 1 movie.

    Thanks again.
     

    Roberman

    Portal Member
    February 9, 2010
    12
    4
    Home Country
    Spain
    Hello.

    I recently discovered Moving Pictures, and I like it very much. I want to use it with my movie collection.
    I tried this scraper, but it seems that some information has changed on FilmAffinity and not all data can be retrieved.

    I have limited knowledge of regular expressions, but I have tried to modify the scraper so it can read the genre list.
    After a few tests I think I got it.

    It is a small piece of work, but maybe someone will find it useful.

    If I have time (and luck) I will try to fix other problems, like the summary data.

    I took the liberty of changing the scraper version (to version 1.0.5 :D)
     

    Attachments

    • FilmAffinity (IMDb.es) v1.0.5.xml
      30.7 KB

    Gixxer

    Retired Team Member
  • Premium Supporter
  • August 18, 2007
    1,383
    41
    40
    Spain
    Home Country
    Spain
    Roberman, I will try your version. Thanks for trying to improve it. Hopefully we can get a fully working Spanish scraper.
     

    franky52

    Portal Member
    December 15, 2008
    12
    0
    Home Country
    Spain
    So you got genre working???

    What I got working by digging into the code is the summary. By mixing both versions, and with some more research, we can make a perfect Spanish scraper.

    I'm not at home now, but tomorrow I'll try to merge yours and mine and I'll publish it for testing.



     

    franky52

    Portal Member
    December 15, 2008
    12
    0
    Home Country
    Spain
    As promised, here is the new version with genre and description working.

    I tested with just 1 film (I'm away from home and that's what I have :D).

    There is one thing I'd like to correct with the description. FilmAffinity adds "FILMAFFINITY" at the end of the description. I don't know if it can be removed.
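
    A trailing site marker like this can usually be dropped with one more regular expression. The scraper itself is an .xml script, so the Python below only illustrates the pattern and is not a drop-in fix. The exact spelling of the marker and whether it appears in parentheses are assumptions, which is why the marker is a parameter; strip_site_marker is a hypothetical helper and the sample description is invented.

```python
import re


def strip_site_marker(description: str, marker: str = "FILMAFFINITY") -> str:
    """Remove a trailing site marker, with or without parentheses.

    The default marker text is an assumption; pass the exact text that
    appears at the end of the scraped description.
    """
    # Optional whitespace, optional "(", the literal marker, optional ")",
    # optional whitespace, anchored at the end of the text.
    pattern = r"\s*\(?" + re.escape(marker) + r"\)?\s*$"
    return re.sub(pattern, "", description)


if __name__ == "__main__":
    sample = "Un policía investiga un crimen. (FILMAFFINITY)"
    print(strip_site_marker(sample))
```

    Anchoring the pattern at the end of the text keeps it from touching the same word if it happens to appear inside the description.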

    And with genres, the only film I tested (Los Sustitutos "Surrogates") has 4 genres (Ciencia-Ficción. Acción. Intriga. Robots) but the scraper only picks up the last 3.
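
    One possible cause of a missing first genre (a guess, not a diagnosis of the attached scraper) is a pattern that only captures items that follow a separator. The Python sketch below reproduces that effect on the genre text quoted above and contrasts it with a plain split. It assumes the genres arrive as period-separated plain text, which may not match the real page markup; both function names are hypothetical.

```python
import re

GENRE_TEXT = "Ciencia-Ficción. Acción. Intriga. Robots"


def genres_after_separator(text: str) -> list:
    """Capture only the items preceded by a period, so the first item is lost."""
    return re.findall(r"\.\s*([^.]+)", text)


def split_genres(text: str) -> list:
    """Split on periods and keep every non-empty item, including the first."""
    return [part.strip() for part in re.split(r"\.", text) if part.strip()]


if __name__ == "__main__":
    print(genres_after_separator(GENRE_TEXT))  # three genres, first one missing
    print(split_genres(GENRE_TEXT))            # all four genres
```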

    I will keep working on this, but I'd like to have some kind of guide on how it works.


    View attachment FilmAffinity (IMDb.es) v1.0.6.xml
     
