Scraper request - www.kinopoisk.ru [RU]

Discussion in 'Moving Pictures' started by mitiok2008, February 14, 2009.

  1. mitiok2008

    mitiok2008 Portal Pro

    Joined:
    February 1, 2009
    Messages:
    115
    Likes Received:
    1
    Ratings:
    +1 / 0
    This is the thread for the kinopoisk.ru scraper (kinopoisk.ru is a Russian movie resource). The scraper was originally created by LRFalk01 and is currently maintained by me (mitiok2008). Thanks to everyone who spent so much time creating it and teaching me.

    1. Please be aware that a good internet connection is very important. Kinopoisk.ru is well known as a sluggish site, and if you also have trouble with your internet connection you may get unexpected results.
    2. I recommend storing each movie in a separate folder named after the Russian (or original) movie title. Torrent file names are usually very odd, and it's hard to find exact matches for them. In my experience, a collection of 120 movies stored in separate folders was matched automatically in half an hour. Manual input was needed for only 15 movies (and really only to confirm the proper movie from the list).
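
To illustrate the folder recommendation above, here is a minimal Python sketch of how a search title could be derived: the parent folder name is preferred, and a torrent-style file name is cleaned only as a fallback. The function name `search_title` and the list of release tags are hypothetical; this is not how Moving Pictures or the scraper itself is implemented.

```python
import re
from pathlib import Path

# Hypothetical, incomplete list of release tags often seen in torrent file names.
NOISE = re.compile(
    r"\b(dvdrip|bdrip|hdrip|xvid|divx|x264|720p|1080p|rus|eng)\b",
    re.IGNORECASE,
)

def search_title(movie_path: str) -> str:
    """Return the title to search for: the folder name if the movie sits
    in its own folder, otherwise a cleaned-up version of the file name."""
    path = Path(movie_path)
    folder = path.parent.name
    if folder:
        # The movie is stored in a titled folder, so trust the folder name.
        return folder
    name = re.sub(r"[._]+", " ", path.stem)   # dots/underscores -> spaces
    name = NOISE.sub(" ", name)               # drop release tags
    return re.sub(r"\s+", " ", name).strip()  # collapse whitespace
```

With this sketch, a file stored as `Movies/The Dark Knight/tdk.dvdrip.avi` is searched as "The Dark Knight", while a bare `Ljubov.morkov.2.DVDRip.avi` falls back to "Ljubov morkov 2".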


    Version history:
    Update 1.2.8, 20.07.09
    - updated to follow changes on kinopoisk.ru

    Update 1.2.7, 05.04.09
    - search logic slightly updated: a movie with an exact match (when the scraper lands directly on the Details page) is now recognised correctly;
    - special symbols such as '-', '...', etc. are now loaded correctly.

    Update 1.2.5, 15.03.09
    - search for IMDB ID added. IMDB ID search is available for all non-Russian movies; it uses the Alternate Title from Kinopoisk. Moving Pictures then takes care of loading backdrops from themoviedb.org. For Russian movies, put a backdrop.jpg file in the movie folder. So far that's the easiest option.

    Update 1.2.2, 07.03.09
    - search for covers updated: it looks first in the cover_art folder, then in the posters folder
    - retrieval of the director updated



    Original version 1.1.0, created by LRFalk01.
     

    Attached Files:

    • Kinopoisk.zip
      File size:
      30.6 KB
      Uploaded:
      July 20, 2009
      Views:
      1,441
  3. fforde

    fforde Community Plugin Dev

    Joined:
    June 7, 2007
    Messages:
    2,666
    Likes Received:
    1,690
    Occupation:
    Software Engineer
    Location:
    Texas
    Ratings:
    +1,696 / 0
    Home Country:
    United States of America
    For what it's worth, if someone is interested in taking up this task, the Scraper Writing guide is located here:
    Scraper Scripts - Moving Pictures

    It still could use some work, but it should give a good feel for the scraper engine.
     
  4. LRFalk01

    LRFalk01 Portal Pro

    Joined:
    August 27, 2007
    Messages:
    257
    Likes Received:
    92
    Ratings:
    +92 / 0
    Home Country:
    United States of America
    I will take the ring to Mordor!
     
  5. mitiok2008

    mitiok2008 Portal Pro

    Thanks in advance! I'll provide any translations you need.
     
  6. LRFalk01

    LRFalk01 Portal Pro

    Attached is a scraper for kinopoisk.ru. Let me know what you think.

    -LRFalk01
     

    Attached Files:

    • Kinopoisk.zip
      File size:
      30.6 KB
      Uploaded:
      February 15, 2009
      Views:
      217
  7. I'am

    I'am Portal Member

    Joined:
    December 13, 2008
    Messages:
    10
    Likes Received:
    0
    Location:
    Voronezh
    Ratings:
    +0 / 0
    Home Country:
    Russian Federation
    OK. The other data sources and cover art sources are disabled.
    After a few starts it found 6 matches (out of more than 200 films), 2 matches on every start.
    3 of the films are named in Latin characters; one of them is found on each start (it downloads a good description and 3-4 posters).
    The other 3 start with numerals ("300", "200 Pounds Beauty" and "XIII"); it's the same, one of them is found on each start.
    National characters: :(

    Here is a working script for the built-in media database: http://rapidshare.com/files/198572476/kinopoisk_ru.zip.html Can it help?
     
  8. LRFalk01

    LRFalk01 Portal Pro

    Are you saying that you have over 200 movies and this scraper is only matching 6 movies?

    -LRFalk01
     
  9. mitiok2008

    mitiok2008 Portal Pro

    I haven't been able to check the working script yet. I'll do it this evening (morning in the US :)).
    But I do understand what I'am meant: it's very common here to name a movie with Russian words written in Latin transliteration (Latin letters rather than Cyrillic ones). Many letters have no exact equivalent, and that is the main problem. But I have checked several times on kinopoisk.ru, and it is able to find movies even in such a weird situation. I think that if we adjust the scraper, it will work.
    As I said, I'll check it in a few hours.
    Thank you!
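
The transliteration problem described above can be illustrated with a minimal Python sketch that maps a Latin-transliterated title back to Cyrillic using greedy longest-match substitution. The mapping table and the name `to_cyrillic` are hypothetical and deliberately incomplete; neither the scraper nor kinopoisk.ru is known to work this way.

```python
# Hypothetical, deliberately incomplete Latin -> Cyrillic mapping.
# Real transliteration schemes differ, and some letters (such as the
# soft sign) are often not written in Latin at all.
TRANSLIT = {
    "shch": "щ", "zh": "ж", "kh": "х", "ts": "ц", "ch": "ч", "sh": "ш",
    "ju": "ю", "yu": "ю", "ja": "я", "ya": "я", "jo": "ё", "yo": "ё",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "z": "з", "i": "и", "j": "й", "k": "к", "l": "л", "m": "м",
    "n": "н", "o": "о", "p": "п", "r": "р", "s": "с", "t": "т",
    "u": "у", "f": "ф", "h": "х", "c": "ц", "y": "ы",
}
# Try longer keys first so "shch" wins over "sh", "ju" over "j", etc.
KEYS = sorted(TRANSLIT, key=len, reverse=True)

def to_cyrillic(text: str) -> str:
    """Convert a Latin-transliterated title to lower-case Cyrillic."""
    text = text.lower()
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                out.append(TRANSLIT[key])
                i += len(key)
                break
        else:
            # No mapping (digits, spaces, punctuation): copy unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

For "Ljubov morkov 2" this sketch yields "любов морков 2". The real title is "Любовь-морковь 2", so the soft signs are lost, which is exactly the "no exact equivalent" problem: a search built this way still needs fuzzy matching on the site side.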
     
  10. I'am

    I'am Portal Member

    Yes. But the other films are named in our national code page. I think that is where the problem is.
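
One way a code page mismatch like this can break a search is in how the title is percent-encoded into the request URL. The Python sketch below shows the same Cyrillic title encoded as windows-1251 and as UTF-8. That the site expects windows-1251 is an assumption made for illustration, not something confirmed in this thread, and `encode_query` is a hypothetical helper.

```python
from urllib.parse import quote

def encode_query(title: str, codepage: str = "cp1251") -> str:
    """Percent-encode a title using the given code page.

    Assumption: the target site expects windows-1251 (cp1251) bytes in
    the query string. If it expects UTF-8, pass codepage="utf-8".
    """
    return quote(title, safe="", encoding=codepage)

title = "Любовь"
print(encode_query(title))           # windows-1251 bytes
print(encode_query(title, "utf-8"))  # UTF-8 bytes
```

The two outputs differ completely ("%CB%FE%E1%EE%E2%FC" versus "%D0%9B%D1%8E%D0%B1%D0%BE%D0%B2%D1%8C"), so a server that decodes the query with the other code page sees garbage and finds no match, while Latin-only titles encode identically either way.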
     
  11. mitiok2008

    mitiok2008 Portal Pro

    Well, it's going to be a long way to Mordor, or at least to the Evil Empire (C) Mr. Reagan :) ;)

    The positive: it's working, and that is VERY GOOD. Now some comments (I've attached a log file and will follow it).

    You should start reading from 16-Feb-2009 21:46:29, because until then I was playing with the MP settings. From there on, there are no general problems if the movie is named with its Original Movie Title (the English one). Even if there are some errors in the encoding of the movie name, you can easily type it into "manual search" and it will be found. BUT it works only when the file name closely matches the Original Movie Title.
    When the name is in Russian (in Cyrillic letters), or in Russian typed in Latin letters, nothing works. For example, on the site Ljubov.morkov.2 takes you to the proper page of movies to choose from, and link number 1 is the exact match. In your scraper: 16-Feb-2009 21:47:25.... : No exact match... The same goes for the others.
    This case works neither in automatic mode nor in manual mode.
    The worst case is Russian letters: it doesn't work in either mode, automatic or manual.

    Here is a small guide (I hope it helps): check the differences between two movies, "dark knight 2008" and "Ljubov morkov 2". Importing the first one works perfectly; the second does not. But when searching on KINOPOISK.RU both of them return close results, and the structure of the response is similar.

    There are some other mistakes in poster and cover art loading, but let's get through the critical point first. I think the idea of using the script for video files could be useful (see a couple of messages above).

    Anyway, great start! Thanks a lot!

    BTW, the working script for building the media database doesn't give much better results than yours. As I said, your scraper works perfectly with a proper Original Title. I think it only needs an idea for how to handle Russian letters.
     