Spanish Scraper FilmAffinity.com with IMDb.es bonus to get fanarts -- v2.1.0 | Page 5

Discussion in 'Moving Pictures' started by RoChess, December 28, 2009.

  1. RoChess
    • Premium Supporter

    RoChess Extension Developer

    Joined:
    March 10, 2006
    Messages:
    4,172
    Likes Received:
    1,301
    Ratings:
    +1,675 / 2
    Just add more RegExp noise filters.

    \s?\[Spanish.+?\]|\s?\[\D+\]|......



    The "\s?\[Spanish.+?\]" part is more vigorous.

    \s = any whitespace character (such as a space)
    ? = the previous element is optional
    \[ = look for a literal '[' character; the \ is needed to escape it, because [ otherwise starts a character class in a Regular Expression
    Spanish = look for the literal text 'Spanish'
    . = match any character
    +? = repeat the previous element one or more times, but as few times as possible (lazy), so it stops at the first match of what comes next
    \] = look for a literal ']' character

    So it will grab " [Spanish.......]" and "[Spanish.......]" not caring at all what follows Spanish. But since the [ and ] chars have to be around it, it will not harm a movie title such as "The Spanish Prisoner".
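    The behaviour described above can be checked outside MediaPortal with a small Python sketch. Only the two alternatives shown in the filter are used; the elided "......" part is left out. The sample names and the `clean_title` helper are made up for illustration, and Python's `re` module is assumed to treat these constructs the same way as the regex engine Moving Pictures uses.

```python
import re

# Only the two alternatives quoted in the post; the rest of the filter is not reproduced.
NOISE = re.compile(r"\s?\[Spanish.+?\]|\s?\[\D+\]")

def clean_title(name: str) -> str:
    """Remove every noise match, then trim leftover whitespace."""
    return NOISE.sub("", name).strip()

# Hypothetical sample names, not taken from anyone's collection.
for sample in ("American Gangster [Spanish DVDRip]", "The Spanish Prisoner"):
    print(repr(sample), "->", repr(clean_title(sample)))
```

    The bracketed tag is removed, while the unbracketed word in "The Spanish Prisoner" is left alone.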
     
  3. Friks

    Friks Portal Member

    Joined:
    September 16, 2009
    Messages:
    16
    Likes Received:
    1
    Ratings:
    +1 / 0
    Home Country:
    Spain Spain
    Here is the log. Thank you very much!!!!
     
  4. RoChess
    • Premium Supporter

    RoChess Extension Developer

    Your problem seems to be Noise Filter related as well.

    You need to adjust the default noise filter to remove all the parts of the filename that are NOT related to the title, year, or part number.

    "American Gangster Spanish HDDVDRip By FreAk TEAm.avi"

    That name does not get cleaned up enough by the default noise filter, which causes the scraper serious problems when it tries to find a match.

    You would need something like:

    \sSpanish|\sHDDVDRip|\sBy\sFreAk\sTEAm|.....

    But then you run a giant risk of killing a valid movie title such as "The Spanish Prisoner". What you really should do is adopt an NFO system, or rename all your files to the more common "Movie Title (Year).extension" format.
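    To make that risk concrete, here is a minimal Python sketch using only the three alternatives written out above (the elided "....." part is left out). The `clean_title` helper is hypothetical, the file extension is dropped from the sample for simplicity, and Python's `re` module is assumed to behave like the engine Moving Pictures uses for these simple patterns.

```python
import re

# The three unbracketed alternatives from the post; nothing else is assumed.
UNBRACKETED = re.compile(r"\sSpanish|\sHDDVDRip|\sBy\sFreAk\sTEAm")

def clean_title(name: str) -> str:
    """Remove every noise match, then trim leftover whitespace."""
    return UNBRACKETED.sub("", name).strip()

# The problem filename from the thread, without its extension.
print(clean_title("American Gangster Spanish HDDVDRip By FreAk TEAm"))

# A legitimate title that contains the same word loses it as well.
print(clean_title("The Spanish Prisoner"))
```

    The first name is cleaned as intended, but the second one comes out as "The Prisoner", which is exactly the collateral damage described above.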

    I do not know how to build Artificial Intelligence into the scraper (you are welcome to assist me), so your alternative is to keep adjusting the NoiseFilter to become compatible with all the crazy filenames you have.

    Note: There might still be a problem in the scraper, but for me to look into that you need to enable the scraper debug log option (that little green bug icon; it is in the FAQ) and limit the log files to a *single* new movie being scraped. Either adjust the advanced settings to run only *1* scraper session at a time, or simply wait until all your existing movies are imported (meaning the import section remains empty when you launch the MovingPictures configuration screen, or use the log file to check that), and then re-import or add just *ONE* movie. The problem is that by default MovPic will scrape 5 movies at the same time, which makes your log files incredibly hard to read. It's like 5 movies being played at the same time when you only have one TV to watch them on.
     
  5. Gixxer
    • Team MediaPortal

    Gixxer Retired Team Member

    Joined:
    August 18, 2007
    Messages:
    1,383
    Likes Received:
    41
    Occupation:
    Mechanical Engineer
    Location:
    Spain
    Ratings:
    +41 / 0
    Home Country:
    Spain Spain
    Great, thanks!!

    With the new part it cleans the titles much better.

    However, only 2 movies out of 40 got auto-approved.

    Here is the log.

    I created a new database, altered the noise filter to include your mod, and changed the accepted year difference from 1 to 3 years. I also set auto-approve to reckless.

    Is this a problem with the scraper? :D
     
  6. RoChess
    • Premium Supporter

    RoChess Extension Developer

    Could be, but you need to provide the right log files for me to find out.

    I need the scraper log files then. That little green bug needs to be on, as shown in the FAQ, and you need to limit it to one movie. The movingpictures.log file for a single movie already runs to about 5MB; imagine what 100 movies do.

    Otherwise I'll make a new FAQ entry on how to properly create log files for scraper problems. Adjusting the year difference should not be needed, but if the right movie shows in the drop-down box inside the Configuration screen, it means another scraper picked the right movie, and only the first scraper gets auto-approval permission (unless you adjust the advanced settings).
     
  7. franky52

    franky52 Portal Member

    Joined:
    December 15, 2008
    Messages:
    12
    Likes Received:
    0
    Ratings:
    +0 / 0
    Home Country:
    Spain Spain
    Hi RoChess!!
    Thanks for your scraper. It helps the whole Spanish community. :D:D:D

    But FilmAffinity has changed something on their webpage. There are some parameters that can't be retrieved, such as genre or synopsis. And images are not in the same location, or at least I couldn't reach them with the webpage info contained in the scraper.

    I've changed the code a little bit so the plot summary can be retrieved. But it's been impossible with the rest, as I can't quite understand how the scraper works. I wouldn't mind updating this scraper as long as I could understand it a little bit more.

    Would it be possible to get some kind of guide?

    If you need some logs, let me know and I'll attach a completely new log with just 1 movie.

    Thanks again.
     
  8. Roberman

    Roberman Portal Member

    Joined:
    February 9, 2010
    Messages:
    12
    Likes Received:
    4
    Ratings:
    +4 / 0
    Home Country:
    Spain Spain
    Hello.

    I recently discovered Moving Pictures, and I like it very much. I want to use it with my movie collection.
    I tried this scraper, but it seems that some information has changed on FilmAffinity and not all data can be retrieved.

    I have limited knowledge of regular expressions, but I have tried to modify the scraper so it can read the genre list.
    After a few tests I think I got it.

    It is only a little work, but maybe someone will find it useful.

    If I have time (and luck) I will try to fix other problems, like the summary data.

    I took the liberty of changing the scraper version (to version 1.0.5 :D)
     

    Attached Files:

  9. Gixxer
    • Team MediaPortal

    Gixxer Retired Team Member

    Roberman, I will try your version. Thanks for trying to improve it. Hopefully we can get a fully working Spanish scraper.
     
  10. franky52

    franky52 Portal Member

    So you got genre working?

    What I got working by digging into the code is the summary. By mixing both versions and doing some more research we can make a perfect Spanish scraper.

    I'm not at home now, but tomorrow I'll try to merge yours and mine and I'll publish the result for testing.



     
  11. franky52

    franky52 Portal Member

    As I promised, the new version with genre and description working.

    I tested with just 1 film (I'm away from home and that's all I have here :D).

    There is one thing I'd like to correct in the description. FilmAffinity adds "FILMAFFINITY" at the end of the description. I don't know if it can be removed.
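    As a rough idea of how such a trailing marker could be stripped with a regular expression, here is a small Python sketch. It is not the scraper's own script language, the `strip_site_marker` helper is hypothetical, and the exact form of the marker (with or without parentheses) is an assumption, so it is passed in as a parameter.

```python
import re

def strip_site_marker(description: str, marker: str = "FILMAFFINITY") -> str:
    """Remove a trailing site marker, optionally wrapped in parentheses.

    The default marker text is an assumption; pass the exact string the
    site appends if it differs.
    """
    pattern = r"\s*\(?" + re.escape(marker) + r"\)?\s*$"
    return re.sub(pattern, "", description)

# Hypothetical description text, for illustration only.
print(strip_site_marker("Un policía investiga un crimen. (FILMAFFINITY)"))
```

    Anchoring the pattern with `$` means the marker is only removed when it is the last thing in the description, so the same word appearing earlier in the text is left untouched.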

    And with genres, the only film I tested (Los Sustitutos, "Surrogates") has 4 genres (Ciencia-Ficción. Acción. Intriga. Robots), but the scraper only picks up the last 3.
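    One possible explanation, and this is only a guess since the scraper's actual genre expression is not shown in this thread, is a pattern that expects a separator in front of every genre, which would skip the first one. The Python sketch below reproduces that symptom on the genre string quoted above and compares it with a plain split; both helper names are hypothetical.

```python
import re

GENRES_TEXT = "Ciencia-Ficción. Acción. Intriga. Robots"

def genres_after_separator(text: str) -> list[str]:
    """Capture only items preceded by '.', so the first genre is missed."""
    return [g.strip() for g in re.findall(r"\.\s*([^.]+)", text)]

def genres_by_split(text: str) -> list[str]:
    """Split on '.' and keep every non-empty item, including the first."""
    return [g.strip() for g in text.split(".") if g.strip()]

print(genres_after_separator(GENRES_TEXT))
print(genres_by_split(GENRES_TEXT))
```

    If the real expression has this shape, letting the separator be optional or splitting the whole list would recover the missing first genre.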

    I will keep working on this, but I'd like to have some kind of guide on how it works.


    View attachment FilmAffinity (IMDb.es) v1.0.6.xml
     