CSFD scraper script 0.3.1 [CZ]

Discussion in 'Moving Pictures' started by Kucheek, February 1, 2014.

  1. Kucheek

    Kucheek Portal Member

    Joined:
    December 18, 2008
    Messages:
    22
    Likes Received:
    8
    Gender:
    Male
    Occupation:
    Software Test Analyst
    Location:
    Prague
    Ratings:
    +8 / 0
    Home Country:
    Czech Republic
    This is a continuation of this thread.



    Czech scraper script for CSFD.cz
    A scraper script for the great Moving Pictures plugin.
    • The movie name can be in the original language, English or Czech
    • Title, Aka Titles, Year, Directors, Actors, Genres, Score and Summary are retrieved from CSFD
    • Writers, Certification, Language, Tagline and Runtime are retrieved from IMDb
    • First original title from CSFD details page is used as alternate title for matching
    • Functional matching based on IMDb ID from NFO files
    • Certification is UK with Czech description (e.g. 15 = Přístupné od 15 let)
    • Articles (The, A, An, Ein, Das, Der, Die, El, Les, Un and Une) are moved to the beginning of the original movie names
    • Not all languages retrieved from IMDb are translated into Czech
    • Writers are retrieved from IMDb so they are without accents
    • Tagline is retrieved from IMDb so it is in English
    • Runtime is retrieved from IMDb; if it isn't found there, the script attempts to find it on CSFD
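The article handling from the list above can be sketched in Python. This is only an illustration, not the scraper's actual code (the scraper is a Moving Pictures XML script). It assumes CSFD lists such titles with the article appended after a comma (e.g. "Matrix, The"), and the function name `move_article_to_front` is hypothetical.

```python
import re

# Articles named in the feature list above.
ARTICLES = ("The", "A", "An", "Ein", "Das", "Der", "Die", "El", "Les", "Un", "Une")

# Assumption: the title ends with ", <Article>" (e.g. "Matrix, The").
_TRAILING_ARTICLE = re.compile(
    r"^(?P<name>.+),\s*(?P<article>" + "|".join(ARTICLES) + r")$",
    re.IGNORECASE,
)


def move_article_to_front(title: str) -> str:
    """Return the title with a trailing article moved to the beginning."""
    cleaned = title.strip()
    match = _TRAILING_ARTICLE.match(cleaned)
    if not match:
        # No trailing article found: leave the title unchanged.
        return cleaned
    return f"{match.group('article')} {match.group('name')}"


print(move_article_to_front("Matrix, The"))
print(move_article_to_front("Boot, Das"))
```

Titles without a trailing article pass through unchanged, so the function can be applied to every original title.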
    Version: 0.3.1
    • Fixed title recognition for the case when CSFD returns the details page directly upon search and the movie has a film type defined (TV Movie, Drama etc.)
    Version: 0.3.0

    The goal of this version was to make the auto approvals based on NFO files with IMDb IDs possible and also to obtain better matches using alternative movie names when the scraper has to rely only on the filename title. Summary retrieval was also fixed.
    • The IMDb ID and alternative title from the details pages of found movies are now retrieved during the scraper's search phase, which is necessary for correct matching and also makes IMDb ID auto-approval possible.
    • Fixed Summary retrieval from CSFD
    • Fixed recognition of the details page
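To illustrate the IMDb ID matching mentioned above: an IMDb title ID has the form `tt` followed by digits, and NFO files commonly contain it either bare or inside an IMDb URL. The sketch below shows how such an ID could be pulled out of NFO text. How the plugin itself reads NFO files is not shown in this thread, so the helper name `find_imdb_id` and the sample text are assumptions made for illustration.

```python
import re
from typing import Optional

# IMDb title IDs are "tt" followed by at least seven digits.
_IMDB_ID = re.compile(r"\btt\d{7,}\b")


def find_imdb_id(nfo_text: str) -> Optional[str]:
    """Return the first IMDb title ID found in the NFO text, or None."""
    match = _IMDB_ID.search(nfo_text)
    return match.group(0) if match else None


# Made-up NFO content, for illustration only.
sample_nfo = "Some release notes\nhttp://www.imdb.com/title/tt0133093/\n"
print(find_imdb_id(sample_nfo))
```

With an ID in hand, a candidate from the search phase can be compared by ID instead of by title, which is what makes automatic approval reliable.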
    Installation:
    1. Download the .xml file (attachment at bottom of this post).
    2. Open "MediaPortal Configuration", go to "Plugins", select "Moving Pictures" and click "Config".
    3. Select the "Importer Settings" tab.
    4. In the "Data Sources" section select the "Manually manage movie data sources" radio button.
    5. Click the "Movie Details Data Sources" button.
    6. In the popup click the arrow just to the right of the "+" button and pick "Add a New Data Source".
    7. Browse to the .xml scraper file you have downloaded and click OK.
    8. It should automatically update the existing "CSFD.cz" scraper to the new version.
     

    Attached Files:

    • CSFD.xml (16.2 KB, uploaded April 12, 2014)
    Last edited: April 12, 2014
  3. ltfearme
    • Premium Supporter

    ltfearme Community Plugin Dev

    Joined:
    June 10, 2007
    Messages:
    6,457
    Likes Received:
    4,241
    Gender:
    Male
    Occupation:
    Software Test Engineer
    Location:
    Sydney
    Ratings:
    +5,385 / 0
    Home Country:
    Australia
    I added this to svn r1566.
     
  4. JiRo
    • Premium Supporter

    JiRo MP Donator

    Joined:
    May 1, 2009
    Messages:
    184
    Likes Received:
    40
    Gender:
    Male
    Occupation:
    Sales Manager
    Location:
    Prague
    Ratings:
    +44 / 0
    Home Country:
    Czech Republic
    Hi Kucheek.

    Yesterday I installed a new MP client and then tried your new CSFD scraper. It works well. Thanks for your perfect work; I found only two small mistakes:

    Oběšenec.mkv - Oběšenec <span class="film-type">(TV film)</span> (2001)
    Lijavec.avi - Lijavec <span class="film-type">(divadelní záznam)</span> (1997)

    No tragedy, a quick manual edit and both are perfect :) If you need any testing done, I'm ready for it.

    JiRo.
     
  5. Kucheek

    Hi JiRo, thanks for pointing that out. Both mistakes you discovered appear to result from a combination of two conditions that don't occur very often: the movie had a type defined in the title and, at the same time, the search returned the movie's details page directly. I have modified the regular expression to account for such cases.
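As an illustration of the kind of pattern involved, the following Python sketch parses the two strings JiRo reported. The scraper's actual regular expression lives in the attached CSFD.xml and is not reproduced here, so the function name `parse_title` and the exact pattern are assumptions.

```python
import re
from typing import Optional, Tuple

# Title, then an optional film-type span, then the year in parentheses.
_TITLE = re.compile(
    r'^\s*(?P<title>.+?)'
    r'(?:\s*<span class="film-type">\((?P<type>[^)]*)\)</span>)?'
    r'\s*\((?P<year>\d{4})\)\s*$'
)


def parse_title(raw: str) -> Optional[Tuple[str, Optional[str], str]]:
    """Split a raw heading into (title, film type or None, year)."""
    match = _TITLE.match(raw)
    if not match:
        return None
    return match.group("title"), match.group("type"), match.group("year")


print(parse_title('Oběšenec <span class="film-type">(TV film)</span> (2001)'))
print(parse_title("Lijavec (1997)"))
```

Making the film-type span optional lets one pattern handle headings both with and without a type, so the title no longer absorbs the markup.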

    The fix is in version 0.3.1; see the first post for the attached file.

    Cheers
    Kucheek
     
  6. JiRo
    Hi Kucheek.

    CSFD changed the "Obsah" (summary) section on the movie details page. I've tried to fix it. Since you maintain the CSFD scraper, I'm giving you this draft for your review.

    Cheers,
    Jiro.
     

    Attached Files:

  7. Kucheek

    Hi Jiro, sorry for the late response; please go ahead and modify the script as necessary. As I no longer use MediaPortal (I moved my HTPC to Linux), I leave it to others to keep the script up to date. Thanks
     