Scraper request - www.kinopoisk.ru [RU] | Page 4

Discussion in 'Moving Pictures' started by mitiok2008, February 14, 2009.

  1. LRFalk01

    LRFalk01 Portal Pro

    Joined:
    August 27, 2007
    Messages:
    257
    Likes Received:
    92
    Ratings:
    +92 / 0
    Home Country:
    United States of America
    With Windows I am unable to unzip this file. In 7-Zip the Russian text is all messed up.



    -LRFalk01
     
  3. mitiok2008

    mitiok2008 Portal Pro

    Joined:
    February 1, 2009
    Messages:
    115
    Likes Received:
    1
    Ratings:
    +1 / 0
    I hope you will be able to read at least one of them :).
     

    Attached Files:

    • my movies.rar
      File size:
      30.6 KB
      Uploaded:
      February 20, 2009
      Views:
      105
    • my movies.zip
      File size:
      30.6 KB
      Uploaded:
      February 20, 2009
      Views:
      101
    • my movies.xml
      File size:
      30.6 KB
      Uploaded:
      February 20, 2009
      Views:
      145
  4. LRFalk01

    LRFalk01 Portal Pro

    Joined:
    August 27, 2007
    Messages:
    257
    Likes Received:
    92
    Ratings:
    +92 / 0
    Home Country:
    United States of America
    Hey mitiok2008,

    Sorry about that. Apparently the problem was Notepad++. In this xml, is it fair to assume that your file name is the translated title?

    -LRFalk01
     
  5. mitiok2008

    mitiok2008 Portal Pro

    Joined:
    February 1, 2009
    Messages:
    115
    Likes Received:
    1
    Ratings:
    +1 / 0
    Nope! URL = filename. Translated title = Translated Title as is :).
     
  6. mitiok2008

    mitiok2008 Portal Pro

    Joined:
    February 1, 2009
    Messages:
    115
    Likes Received:
    1
    Ratings:
    +1 / 0
    I've tried the new version you sent me. It's better. It seems the encoding issue is solved: it searches in Russian or in English. I think the remaining problem is connection timeouts. Look at the log file: there are a lot of unanswered requests (or timeouts). Moreover, very few movies were found after the first run. Then I started searching movie by movie, and in some cases it brought me correct results, in others not.
    So I think the problem so far is not in the scraper but in Moving Pictures itself.
     
  7. LRFalk01

    LRFalk01 Portal Pro

    Joined:
    August 27, 2007
    Messages:
    257
    Likes Received:
    92
    Ratings:
    +92 / 0
    Home Country:
    United States of America
    I will keep working on this, mitiok2008. Thanks for helping me out with testing.

    -LRFalk01
     
  8. mitiok2008

    mitiok2008 Portal Pro

    Joined:
    February 1, 2009
    Messages:
    115
    Likes Received:
    1
    Ratings:
    +1 / 0
    Thank you! I'll stay with MyFilms for now, but I would love to use Moving Pictures, so I'm always ready to test for you.
     
  9. mitiok2008

    mitiok2008 Portal Pro

    Joined:
    February 1, 2009
    Messages:
    115
    Likes Received:
    1
    Ratings:
    +1 / 0
    It's working!!! I think your latest version is close to a pre-release (see some ideas below). I loaded my whole database from kinopoisk.ru into Moving Pictures. Automatic file-name recognition worked for about 70-80% of the movies. The remaining 20-30% failed because of weird filenames, not because of the scraper. When I put the correct movie name (Russian or original, it doesn't matter) into the manual search string, the movie info was pulled correctly!

    So I think it's a great job! THANK YOU!!! :D:D:D

    Here are some ideas on how to make it perfect:
    1. It's a good idea to increase the connection timeout from 5 to 10. Maybe it should be an adjustable value? It works OK with my Internet connection now, but maybe it won't work for others. Anyway, the site behaves quite strangely sometimes: it gives different results for the same query. I think we just have to live with it.

    2. Is it possible to have a list of words somewhere that will be excluded from the search string? There are some common words used in file naming (usually rippers' names, etc.). If they could be added to such a list, the quality of the automatic search would be better.

    3. It seems that kinopoisk.ru doesn't like dates inside movie names. A movie with a date in its name is harder to recognise than one without. I think there should be a way to put digits on the exclusion list as well (as an adjustable setting).
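
A minimal sketch of ideas 2 and 3 above, assuming the search string is built from the file name. The noise-word list, the function name `clean_search_string`, and the `strip_years` switch are hypothetical and only illustrate the idea; this is not how Moving Pictures or the scraper actually does it.

```python
import re

# Hypothetical noise words; a real list depends on how your files are named.
NOISE_WORDS = {"dvdrip", "bdrip", "hdrip", "xvid", "divx", "rus", "eng"}


def clean_search_string(filename, noise_words=NOISE_WORDS, strip_years=True):
    """Turn a file name into a search string by dropping noise tokens."""
    # Drop a trailing extension such as ".avi" (must start with a letter,
    # so a trailing ".2008" is not mistaken for an extension).
    stem = re.sub(r"\.[A-Za-z][A-Za-z0-9]{1,3}$", "", filename)
    # Treat whitespace, dots, underscores, dashes and brackets as separators.
    tokens = re.split(r"[\s._\-\[\]()]+", stem)
    kept = []
    for token in tokens:
        if not token:
            continue  # empty string produced by a leading/trailing separator
        if token.lower() in noise_words:
            continue  # ripper names, codecs and similar tags
        if strip_years and re.fullmatch(r"(19|20)\d{2}", token):
            continue  # four-digit year such as 2008
        kept.append(token)
    return " ".join(kept)
```

With these assumptions, `clean_search_string("Madagascar.2.2008.DVDRip.XviD.avi")` keeps only the title and sequel number, and passing `strip_years=False` leaves the year in place for sites that handle it well.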

    Then the most interesting part about pictures.

    Cover-art
    I'm sure it's better to check http://www.kinopoisk.ru/level/17/film/MOVIE-NUMBER/adv_type/cover/ first. In most cases you can find cover art there. Only if you can't find any there should you go for posters at http://www.kinopoisk.ru/level/17/film/MOVIE-NUMBER/. The posters there very often don't have appropriate dimensions (have a look at my example: the first poster is wider). If there is no cover art in the /adv_type/cover/ section, OK, let's go for a poster, but in all other cases the covers are more appropriate. Also, is it possible to load more than 3 covers? Quite often 3 is not enough (the necessary one isn't among them).
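
The covers-first, posters-second order described above could be sketched like this. The two URL patterns are the ones quoted in the post; `find_cover_art`, `list_images`, and `max_covers` are hypothetical names, and `list_images` stands in for whatever code downloads a page and extracts its image URLs.

```python
COVER_URL = "http://www.kinopoisk.ru/level/17/film/{movie_id}/adv_type/cover/"
POSTER_URL = "http://www.kinopoisk.ru/level/17/film/{movie_id}/"


def find_cover_art(movie_id, list_images, max_covers=3):
    """Return up to max_covers image URLs, preferring covers over posters.

    list_images is a hypothetical callable: it takes a page URL and returns
    the list of image URLs found on that page (an empty list if none).
    """
    images = []
    for template in (COVER_URL, POSTER_URL):
        images = list_images(template.format(movie_id=movie_id))
        if images:
            break  # stop at the first section that has anything
    return images[:max_covers]
```

Raising `max_covers` covers the request for more than 3 covers without changing the fallback order.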

    Back-drops
    I'm really not sure if I should ask for this. You've done too much for me already. But OK, I'll try ;).
    For backdrops there are two main alternatives:
    http://www.kinopoisk.ru/level/12/film/MOVIE-NUMBER/ (wallpapers) or http://www.kinopoisk.ru/level/13/film/MOVIE-NUMBER/ (movie shots). The best way is to make this an adjustable parameter: get backdrops either from wallpapers or from movie shots. By the way, the logic can be similar to the above: if there is nothing in wallpapers, go to movie shots, and only if nothing is there either, go to posters.
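
One way to make the backdrop source adjustable, as suggested above, is to compute the order of pages to try from a single setting. This is only a sketch: the URL patterns come from the post, while `backdrop_pages` and its `preferred` parameter are hypothetical.

```python
BACKDROP_SOURCES = {
    "wallpapers": "http://www.kinopoisk.ru/level/12/film/{movie_id}/",
    "shots": "http://www.kinopoisk.ru/level/13/film/{movie_id}/",
    "posters": "http://www.kinopoisk.ru/level/17/film/{movie_id}/",
}


def backdrop_pages(movie_id, preferred="wallpapers"):
    """Return the page URLs to try, in order, for a movie's backdrops."""
    if preferred not in ("wallpapers", "shots"):
        raise ValueError("preferred must be 'wallpapers' or 'shots'")
    # The non-preferred source comes second; posters are always the last resort.
    other = "shots" if preferred == "wallpapers" else "wallpapers"
    order = [preferred, other, "posters"]
    return [BACKDROP_SOURCES[name].format(movie_id=movie_id) for name in order]
```

The caller would then walk the returned list and stop at the first page that yields any images.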

    That's it! LRFalk01, it doesn't matter whether you have enough time to finalize the scraper; it's already not bad. But if you were able to make it perfect ;););) :D:D:D

    P.S.: I went to the advanced settings for Moving Pictures and found some useful settings, for example the number of cover-art files and the noise filter (it would be great to have a Noise Filter tab in the regular settings pages; it's not easy to add a new filter). Anyway, at least two of my questions are already answered.
     
  10. mitiok2008

    mitiok2008 Portal Pro

    Joined:
    February 1, 2009
    Messages:
    115
    Likes Received:
    1
    Ratings:
    +1 / 0
    I think it's more or less finalized. Thanks to LRFalk01, who actually made the scraper, and to fforde, who helped me a lot with some small polishing of the scraper. :D:D:D

    Scraper info
    It pulls information from KINOPOISK.RU. All details are in Russian. Cover art is taken from two sections: first from covers (обложки), then from posters (постеры). Backdrops (fan art): currently you either need to download them manually and add them to Moving Pictures, or ... see hint #1.

    Hints
    1. It's better to store your movies in separate folders named with the original or Russian movie name. Files on Russian torrent sites are named really weirdly. You can go to Advanced setting \ movie importer\local media parser\Regular expression noise filter, but that will only get you part of the way: you will still need to choose your movie from the list in the Moving Pictures configuration. The many different ways of transliterating English to Russian are really hard to handle, even for the good search engine on kinopoisk.ru. So just put your movie in a separate folder and call it "Мадагаскар 2", and in most cases you will get fully automatic detection in Moving Pictures. If you also put a backdrop file named backdrop.jpg in this folder, that's it, you have a nice background (fan art).
    2. Kinopoisk is well known as a sluggish site. Sometimes it doesn't return appropriate results, especially if you have a slow Internet connection or your connection is saturated by torrent downloads. So just try to load the details again; eventually you will get them. It's not a problem with the scraper.
    3. As soon as backdrop loading becomes available to scrapers, the scraper will be upgraded to get pictures from the site.
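
The "just try again" advice in hint 2 can be automated. Below is a minimal sketch, assuming a hypothetical `fetch(url, timeout)` callable that returns the page body and raises `OSError` (which covers socket timeouts) on failure; the default of 10 follows the timeout suggested earlier in the thread, taken here to be seconds, and the other defaults are arbitrary.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, timeout=10, delay=2.0):
    """Call fetch(url, timeout) until it succeeds or attempts run out.

    fetch is a hypothetical callable that returns the page body and raises
    OSError (which includes socket timeouts) when the request fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url, timeout)
        except OSError as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay)  # give the site a moment before retrying
    raise last_error
```

For a real request, `fetch` could be something like `lambda url, timeout: urllib.request.urlopen(url, timeout=timeout).read()` after `import urllib.request`.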
     

    Attached Files:

    • Kinopoisk.xml
      File size:
      30.6 KB
      Uploaded:
      March 5, 2009
      Views:
      165
  11. fforde

    fforde Community Plugin Dev

    Joined:
    June 7, 2007
    Messages:
    2,666
    Likes Received:
    1,690
    Occupation:
    Software Engineer
    Location:
    Texas
    Ratings:
    +1,696 / 0
    Home Country:
    United States of America