[Finished] Scrapper improvements

morpheus_xx

Retired Team Member
  • Team MediaPortal
  • March 24, 2007
    12,073
    7,459
    Home Country
    Germany
    I will check how to include your regexp without breaking all other variants. I have a test tool for some common patterns and will add your regexp and series folder structure.

    Can you please provide full examples of the real paths you use? Ideally a few different episodes, seasons and series, so I can use them for testing.
     

    riggnix

    Portal Pro
    September 8, 2009
    95
    25
    Home Country
    Austria
    Thank you!

    An example would be:

    "G:\Serien\The Big Bang Theory\Staffel 1 [DEU-ENG][720p]\S01E01 Penny und die Physiker.mkv"
     

    morpheus_xx

    I modified your regexp a bit to support both the S1E01 and the 1x01 patterns:
    C#:
    new Regex(@"(?<series>[^\\]*)\\[^\\]*(?<seasonnum>\d+)[^\\]*\\S*(?<seasonnum>\d+)[EX](?<episodenum>\d+)*(?<episode>.*)\.", RegexOptions.IgnoreCase),
    So it matches my created examples:
    @"G:\Serien\The Big Bang Theory\Staffel 1 [DEU-ENG][720p]\S01E01 Penny und die Physiker.mkv" ,
    @"G:\Serien\The Big Bang Theory\Staffel 1 [DEU-ENG][720p]\1x01 Penny und die Physiker.mkv" ,

    I also tested my other path examples and they still work correctly, as your version doesn't match them.
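For readers who want to try the pattern outside the plugin, here is a minimal sketch of how it could be applied to the two sample paths. The class `EpisodePathSketch` and the helper `Parse` are hypothetical names used only for illustration; they are not part of SeriesMetadataExtractor. Because .NET allows the same group name to appear twice, `Groups["seasonnum"].Value` returns the last capture, which here is the season number taken from the file name rather than the digit found in the folder name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EpisodePathSketch
{
    // The pattern from the post above, unchanged.
    private static readonly Regex Pattern = new Regex(
        @"(?<series>[^\\]*)\\[^\\]*(?<seasonnum>\d+)[^\\]*\\S*(?<seasonnum>\d+)[EX](?<episodenum>\d+)*(?<episode>.*)\.",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns null when the path does not match
    // or when no episode number was captured.
    public static (string Series, int Season, int Episode, string Title)? Parse(string path)
    {
        Match m = Pattern.Match(path);
        if (!m.Success || !m.Groups["episodenum"].Success)
            return null;

        return (m.Groups["series"].Value,
                // "seasonnum" occurs twice in the pattern; Value is the last capture.
                int.Parse(m.Groups["seasonnum"].Value),
                int.Parse(m.Groups["episodenum"].Value),
                m.Groups["episode"].Value.Trim());
    }

    public static void Main()
    {
        string[] paths =
        {
            @"G:\Serien\The Big Bang Theory\Staffel 1 [DEU-ENG][720p]\S01E01 Penny und die Physiker.mkv",
            @"G:\Serien\The Big Bang Theory\Staffel 1 [DEU-ENG][720p]\1x01 Penny und die Physiker.mkv",
        };

        foreach (string path in paths)
        {
            var r = Parse(path);
            Console.WriteLine(r.HasValue
                ? $"{r.Value.Series} S{r.Value.Season:00}E{r.Value.Episode:00} {r.Value.Title}"
                : "no match: " + path);
        }
    }
}
```

Both sample paths should come out as series "The Big Bang Theory", season 1, episode 1, with the title "Penny und die Physiker". A path without a numbered season folder is not matched by this pattern at all, which is consistent with the other path variants being left to the existing patterns.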

    @breese I also enabled the "nnn" pattern to match your sample series.

    Can you please try the attached plugin (it needs to be copied into the MP2-Server\Plugins folder)? After replacing it, create a new share for your series and let it scan.
     

    Attachments

    • SeriesMetadataExtractor.7z
      9.5 KB

    jpichie

    Portal Pro
    May 12, 2011
    102
    8
    Home Country
    Canada
    They also need a better movie scraper, like RoChess's for MP1,
    one that groups the movies of a series together and numbers them.

    Example, Harry Potter 1 - blah blah
    Harry Potter 2 - ....
    And so on
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897
    jpichie said:
    They also need a better movie scraper, like RoChess's for MP1,
    one that groups the movies of a series together and numbers them.

    If they add the MovPic scraper-engine to MP2, then I can make IMDb+ compatible. I recall reading about an intention to do so, but I am not sure what the current plans are.
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    2,478
    1,385
    The intention was mine, and I still have it. But while I was working on the importer process itself, adding multitasking to gain some speed, we realized (again) that our current implementation of the MediaLibrary cannot model everything we have been asked for. So I am currently improving the MediaLibrary first to make all those requests possible (handling of multi-file videos, etc.). After that (or in connection with it) we still plan to introduce multitasking to the importer, and once that is done there are plans to expand our scrapers (or, as MP2 calls them, MetadataExtractors). One of these enhancements may be a MetadataExtractor that brings MovPic's scraper-engine to MP2. There is still a long way to go, but we have been talking about all this for more than a year, and at least the process has now started. So stay tuned...
     

    RoChess

    MJGraf said:
    The intention was mine - and I still have the intention.

    Awesome, and no rush; I know all too well how the best intentions can be sidetracked by work etc. IMDb+ grew out of frustration with waiting for some features to be added to Moving-Pictures directly, such as the ability to sort and possibly 'group' movie series together, preferably with a single "James Bond Collection" entry within "All Movies". Because IMDb+ is only a scraper-script, there was no way for me to 'group' the movies into a single collection entry, but I was at least able to rename all the titles semi-automatically so that they sort/group together indirectly.

    If the main functions are integrated directly into MP2, then there is not much need for IMDb+, except for one small detail that is a constant, annoying battle: Amazon/IMDb keep changing things in ways that break the scraping of their pages or return wrong data. If you look at the version history of the IMDb+ scraper-script alone, you get a quick idea of how often fixes are required. It is hard enough to keep it functional for USA-IP users, let alone for the international audience that I support with IMDb+. Thankfully I get quick feedback from the IMDb+ user community, and it usually takes only a quick fix to the regular expressions to make it all work again or to fix minor bugs.

    The IMDb+ auto-update system then takes care of distributing the update, so most users (unless they import new movies often) do not even notice that a small bug existed, because the updated IMDb+ scraper-script was installed before they needed it.

    Now, I can do that easily with the XML-based IMDb+ scraper-script system, but the C#-based import scripts still give me a little shiver when I think about how to achieve the same results. The same goes for many other scraper-script contributors, which is why adding a MovPic-like scraper system to MP2 would benefit the future support of all those different movie detail websites.

    Pretty sure you and I talked about that already, but others reading this thread might not remember/know.
     

    jpichie

    @MJGraf Has there been any progress on this front, i.e. an improved movie scraper?
    Just curious.
     

    MJGraf

    Have a look at GitHub, branch MP2-422. Things are progressing nicely, but there is a lot of groundwork (and rework) to do before we can actually implement new functionality. We hit some walls due to some less-than-ideal concepts in the code, so we first have to change the foundations before we can build new things on top of them. As soon as the new Importer, based on new technology, has the same functionality as the old one, I'll post more news on what has been done and why. I expect this in a few weeks.
    Michael
     
