Scraper request - www.kinopoisk.ru [RU]

LRFalk01

Portal Pro
August 27, 2007
257
92
38
Home Country
United States of America
With Windows I am unable to unzip this file. In 7-Zip the Russian text is all messed up.

-LRFalk01
 

LRFalk01

Hey mitiok2008,

Sorry about that. Apparently the problem was Notepad++. In this xml, is it fair to assume that your file name is the translated title?

-LRFalk01
 

mitiok2008

Portal Pro
February 1, 2009
115
1
I've tried the new version you sent me. It's better. It seems the encoding issue is solved: it searches in Russian or in English. I think the problem that still exists is connection timeouts. Look at the log file: there are a lot of unanswered requests (or timeouts). Moreover, very few movies were found after the first run. Then I started searching movie by movie, and in some cases it brings me correct results, in others not.
So I think that so far the problem is not in the scraper but in Moving Pictures itself.
 

LRFalk01

I will keep working on this, mitiok2008. Thanks for helping me out with testing.

-LRFalk01
 

mitiok2008

It's working!!! I think your last version is close to a pre-release (see some ideas below). I filled my whole database in Moving Pictures from kinopoisk.ru. Automatic file-name recognition worked for about 70-80% of movies. The remaining 20-30% failed because of weird filenames, not because of the scraper. When I put the correct movie name (Russian or original, it doesn't matter) into the manual search string, the movie info was pulled correctly!

So, I think it's a great job! THANK YOU!!! :D:D:D

Here are some ideas on how to make it perfect:
1. It would be a good idea to increase the connection timeout from 5 to 10. Maybe it should be an adjustable value? It works OK with my Internet connection now, but maybe it will not work for others. Anyway, the site behaves quite strangely sometimes: it gives different results for the same query. I think we just have to live with it.

2. Is it possible to have a list of words that will be excluded from the search string? There are some common words used in file names (usually rippers' names, etc.). If they could be added to such a list, automatic search quality would be better.

3. It seems that kinopoisk.ru doesn't like dates inside movie names. A movie with a date in its name is harder to recognise than one without. I think there should be a way to put digits on the exclusion list as well (an adjustable variable).
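
Ideas 2 and 3 could be sketched roughly as follows. This is only an illustration in Python, not the scraper's actual code: the function name, the noise-word list, the extension list, and the reading of "dates" as four-digit years are all assumptions made for the example.

```python
import os
import re

# Hypothetical defaults; in practice both lists would be user-adjustable.
NOISE_WORDS = {"dvdrip", "xvid", "hdrip", "bdrip", "rus", "eng"}
VIDEO_EXTENSIONS = {".avi", ".mkv", ".mp4"}


def clean_search_string(filename, noise_words=NOISE_WORDS, strip_years=False):
    """Turn a file or folder name into a search string for the site."""
    # Drop the extension only if it is a known video extension,
    # so a folder name like "Movie.2008" is not truncated by mistake.
    stem, ext = os.path.splitext(filename)
    name = stem if ext.lower() in VIDEO_EXTENSIONS else filename

    # Treat dots, underscores, brackets, parentheses and dashes as separators.
    name = re.sub(r"[._\[\]()\-]+", " ", name)

    kept = []
    for word in name.split():
        if word.lower() in noise_words:
            continue  # excluded word (idea 2)
        if strip_years and re.fullmatch(r"(19|20)\d{2}", word):
            continue  # year-like token (idea 3, assumed meaning of "dates")
        kept.append(word)
    return " ".join(kept)


print(clean_search_string("Madagascar.2.2008.DVDRip.XviD.avi", strip_years=True))
```

Only year-like tokens are dropped here rather than all digits, so that a sequel number such as the "2" in "Мадагаскар 2" survives. Removing every digit would be a one-line change but would also lose that number.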

Now the most interesting part: pictures.

Cover-art
I'm sure it's better to check http://www.kinopoisk.ru/level/17/film/MOVIE-NUMBER/adv_type/cover/ first. In most cases you can find cover art there. Only if nothing is found there should you go for posters at http://www.kinopoisk.ru/level/17/film/MOVIE-Number/. The posters there very often have inappropriate dimensions (have a look at my example: the first poster is wider). If there is no cover art in the /adv_type/cover/ section, OK, let's go for posters, but in all other cases the covers are more appropriate. Also, is it possible to load more than 3 cover arts? Quite often 3 is not enough (the necessary one isn't among them).
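
The "covers first, posters only as a fallback" order described above might look like this in outline. It is a sketch under assumptions, not the scraper's real logic: `list_images` is a hypothetical callable that takes a page URL and returns the image URLs found on it, and `max_images` stands in for the adjustable cover-art count.

```python
BASE = "http://www.kinopoisk.ru"


def cover_art_sources(movie_number):
    """Candidate pages in priority order: covers section first, posters second."""
    return [
        f"{BASE}/level/17/film/{movie_number}/adv_type/cover/",
        f"{BASE}/level/17/film/{movie_number}/",
    ]


def collect_cover_art(movie_number, list_images, max_images=3):
    """Return up to max_images image URLs from the first page that has any.

    list_images is a hypothetical helper: page URL -> list of image URLs.
    """
    for page in cover_art_sources(movie_number):
        images = list_images(page)
        if images:
            return images[:max_images]
    return []  # neither section had anything
```

Raising `max_images` covers the request to load more than 3 cover arts without changing the fallback order.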

Back-drops
I'm really not sure if I should ask for this. You have done so much for me already. But OK, I'll try ;).
For backdrops there are two main alternatives:
http://www.kinopoisk.ru/level/12/film/MOVIE-NUMBER/ - wallpapers, or http://www.kinopoisk.ru/level/13/film/MOVIE-NUMBER/ - movie shots. The best way is to make this parameter adjustable: get backdrops either from wallpapers or from movie shots. By the way, the logic can be similar to the above: if there is nothing in wallpapers, go to movie shots, and only then go to posters if nothing is found.
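
One way to picture the adjustable backdrop preference is a small function that turns a setting into an ordered list of pages to try. The setting name `prefer` and its two values are invented for this sketch, and the assumption that the non-preferred section is tried second goes slightly beyond what is written above.

```python
BASE = "http://www.kinopoisk.ru"


def backdrop_sources(movie_number, prefer="wallpapers"):
    """Return backdrop pages in the order they should be tried.

    prefer is a hypothetical setting: "wallpapers" or "shots".
    Posters are always the last resort.
    """
    pages = {
        "wallpapers": f"{BASE}/level/12/film/{movie_number}/",
        "shots": f"{BASE}/level/13/film/{movie_number}/",
    }
    if prefer not in pages:
        raise ValueError(f"unknown backdrop source: {prefer!r}")
    other = "shots" if prefer == "wallpapers" else "wallpapers"
    posters = f"{BASE}/level/17/film/{movie_number}/"
    return [pages[prefer], pages[other], posters]
```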

That's it! LRFalk01, it doesn't matter whether you will have enough time to finalize the scraper - it's already not bad. But if you were able to make it perfect ;););) :D:D:D

P.S.: I went to the advanced settings for Moving Pictures and found some useful settings, for example the number of cover-art files and the noise filter (it would be great to have a Noise Filter tab in the regular settings pages; it's not easy to add a new filter). Anyway, at least two of my questions are already answered.
 

mitiok2008

I think it is more or less finalized. Thanks to LRFalk01, who actually made the scraper, and to ffrode, who helped me a lot with the final polishing of the scraper. :D:D:D

Scraper info
It pulls information from KINOPOISK.RU. All details are in Russian. Cover art is taken from two sections: first from covers (обложки), second from posters (постеры). Backdrops (fan art): currently you need to either download them manually and add them to Moving Pictures, or ... see hint #1.

Hints
  1. It's better to store each movie in a separate folder named after the movie's original or Russian title. Files from Russian torrents are named really weirdly. You can go to Advanced setting \ movie importer \ local media parser \ Regular expression noise filter, but that gives only partial results, and you will still need to choose your movie from the list in Moving Pictures Configuration. The many different ways of transliterating English to Russian are really hard to handle, even for the good search engine on kinopoisk.ru. So just put your movie in a separate folder, call it "Мадагаскар 2", and in most cases you will get fully automatic detection in Moving Pictures. If you also put a backdrop file in this folder and call it backdrop.jpg, that's it, you have a nice background (fan art).
  2. Kinopoisk is well known as a sluggish site. Sometimes it doesn't return appropriate results, especially if you have a slow Internet connection or your connection is saturated by torrent downloads. So just try loading the details again. Eventually you will get them - it's not a problem with the scraper.
  3. Once backdrop loading becomes available to scrapers, the scraper will be upgraded to get pictures from the site.
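
The "just try again" advice in hint 2 amounts to a retry loop around each request. The sketch below is an illustration only, not part of the scraper: `fetch` is a hypothetical callable taking a URL and a timeout, and the defaults (including treating the timeout as seconds) are assumptions.

```python
import time


def fetch_with_retries(fetch, url, timeout=10, attempts=3, delay=2.0):
    """Call fetch(url, timeout) until it succeeds or attempts run out.

    fetch is a hypothetical callable that returns the page text and raises
    OSError (which includes TimeoutError) when the site does not answer.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url, timeout)
        except OSError as error:
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay)  # give the sluggish site a moment
    raise last_error
```

Keeping `timeout` and `attempts` as parameters matches the earlier wish for an adjustable timeout instead of a fixed value.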
 

Attachments

  • Kinopoisk.xml
    30.6 KB
