Hi folks,
I created my first site parser for Video Mediaset. It's a beta; here is the code:
XML:
<Site name="Video Mediaset" util="GenericSite" agecheck="false" enabled="true" lang="it">
<Configuration>
<item key="dynamicSubCategoriesNextPageRegEx"><![CDATA[<a\stitle="Vai\salla\spagina\ssuccessiva"\shref="(?<url>[^"]*)">»</a>]]></item>
<item key="videoListRegEx"><![CDATA[<a\stitle="(?<Title>[^"]*)"\shref="(?<VideoUrl>[^"]*)"\srel="nofollow"><img\salt="(?<Description>[^"]*)"\ssrc="(?<ImageUrl>[^"]*)"></a>]]></item>
<item key="nextPageRegEx"><![CDATA[<a\stitle="Vai\salla\spagina\ssuccessiva"\shref="(?<url>[^"]*)">»</a>]]></item>
<item key="playlistUrlRegEx"><![CDATA[var\svideoMetadataId\s=\s'(?<url>[^']*)';]]></item>
<item key="playlistUrlFormatString"><![CDATA[http://lazzavd.byethost11.com/script/vd.php?id={0}]]></item>
<item key="fileUrlRegEx"><![CDATA[<video\ssrc="(?<m0>[^"]*\.(?<n0>mp4|wmv))"/>]]></item>
</Configuration>
<Categories>
<Category xsi:type="RssLink" name="Puntate Intere">http://www.video.mediaset.it/puntate-intere/puntate-intere.shtml</Category>
<Category xsi:type="RssLink" name="Clip Intrattenimento">http://www.video.mediaset.it/clip/intrattenimento.shtml</Category>
<Category xsi:type="RssLink" name="Clip News">http://www.video.mediaset.it/clip/news.shtml</Category>
<Category xsi:type="RssLink" name="Clip Sport">http://www.video.mediaset.it/clip/sport.shtml</Category>
<Category xsi:type="RssLink" name="Più visti di ieri">http://www.video.mediaset.it/piu-visti/piuvisti-ieri.shtml</Category>
<Category xsi:type="RssLink" name="Più visti della settimana">http://www.video.mediaset.it/piu-visti/piuvisti-settimana.shtml</Category>
<Category xsi:type="RssLink" name="Più visti del mese">http://www.video.mediaset.it/piu-visti/piuvisti-mese.shtml</Category>
</Categories>
</Site>
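To illustrate how the playlist settings above appear to fit together, here is a minimal Python sketch. It assumes that the value captured by playlistUrlRegEx is substituted for {0} in playlistUrlFormatString; that is my reading of the configuration, not a confirmed description of GenericSite internals. The function name and the sample markup are hypothetical, and the regex is ported to Python syntax, where .NET's (?<url>...) becomes (?P<url>...).

```python
import re

# Python port of playlistUrlRegEx: .NET's (?<url>...) is written (?P<url>...) here.
PLAYLIST_URL_RE = re.compile(r"var\svideoMetadataId\s=\s'(?P<url>[^']*)';")

# Same format string as playlistUrlFormatString in the configuration.
PLAYLIST_URL_FORMAT = "http://lazzavd.byethost11.com/script/vd.php?id={0}"


def playlist_url(page_html):
    """Build the playlist URL from a video page, or return None if no id is found.

    Assumption: the captured id is what replaces {0} in the format string.
    """
    match = PLAYLIST_URL_RE.search(page_html)
    if match is None:
        return None
    return PLAYLIST_URL_FORMAT.format(match.group("url"))


# Hypothetical video page fragment, shaped like what the regex expects.
SAMPLE_PAGE = "<script>var videoMetadataId = '123456';</script>"

print(playlist_url(SAMPLE_PAGE))
# prints: http://lazzavd.byethost11.com/script/vd.php?id=123456
```

If the function returns None for a real video page, the id is probably written differently in that page's source, and playlistUrlRegEx would need adjusting.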
It works, but I need your help.
On some pages, part of the HTML source is missing, probably because it is generated by JavaScript at run time, so I'm not able to get the list of videos.
Some examples:
http://www.video.mediaset.it/programma/motogp/archivio-video.shtml
or
http://www.video.mediaset.it/programma/superbike/archivio-video.shtml
and on all pages that have a combobox. The same applies to the search link:
http://www.video.mediaset.it/ricerca/ricerca.shtml?q={0}, for example:
http://www.video.mediaset.it/ricerca/ricerca.shtml?q=iene
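A quick way to check whether a page delivers its video list in the raw HTML is to run videoListRegEx against the downloaded source and count the matches. The sketch below is a Python port of that regex and only an illustration: the function name and both sample fragments are hypothetical, not taken from the real site. Zero matches on a saved copy of a page would suggest the list is filled in by a script after loading.

```python
import re

# Python port of videoListRegEx: .NET's (?<Name>...) is written (?P<Name>...) here.
VIDEO_LIST_RE = re.compile(
    r'<a\stitle="(?P<Title>[^"]*)"\shref="(?P<VideoUrl>[^"]*)"\srel="nofollow">'
    r'<img\salt="(?P<Description>[^"]*)"\ssrc="(?P<ImageUrl>[^"]*)"></a>'
)


def find_videos(html):
    """Return one dict of named groups per video entry found in the raw HTML."""
    return [match.groupdict() for match in VIDEO_LIST_RE.finditer(html)]


# Hypothetical fragment with the list already present in the served HTML.
STATIC_PAGE = (
    '<a title="Clip 1" href="/video/1.shtml" rel="nofollow">'
    '<img alt="First clip" src="/img/1.jpg"></a>'
)

# Hypothetical fragment where a script is expected to fill an empty container.
DYNAMIC_PAGE = '<div id="archive"></div><script src="/js/archive.js"></script>'

print(len(find_videos(STATIC_PAGE)))   # prints 1
print(len(find_videos(DYNAMIC_PAGE)))  # prints 0
```

To try this on the archive or search pages, save their source to a file and pass its contents to find_videos.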
All suggestions will be really appreciated; otherwise I'll have to think about writing a siteutil in C#.
Thank you, and enjoy the first part of Video Mediaset.