Summary: This patch adds a regex replace function to the HTML parser (and therefore indirectly to WebEPG). The regex replacements are applied to the page source.
Problem:
There are situations where a template does not match the whole page being grabbed because of an inconsistency on that page. The template works through the page source until it reaches the inconsistency. After that point the template no longer matches and the rest of the page is ignored.
Inconsistency example:
page source:
Code:
...
<div>Time</div><div>ProgramName</div>
<div>Time</div><div>ProgramName</div>
<div>Time</div><div>ProgramName</div><div>Inconsistency</div>
<div>Time</div><div>ProgramName</div>
…
template:
Code:
<div>#TIME</div><div>#NAME</div>
result:
Code:
1. #TIME:Time; #NAME:ProgramName
2. #TIME:Time; #NAME:ProgramName
3. #TIME:Time; #NAME:ProgramName
4. #TIME:Inconsistency; #NAME:Time ⇒ Error
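The misalignment above can be reproduced with a small Python sketch. This is only a rough stand-in for the template parser, not the actual htmlparser code: the regex `TEMPLATE` and the helper `parse` are hypothetical names, and it assumes that whitespace between the two tags is ignored.

```python
import re

# Rough regex stand-in for the template <div>#TIME</div><div>#NAME</div>.
# Assumption: whitespace between the two tags is skipped.
TEMPLATE = re.compile(r"<div>(.*?)</div>\s*<div>(.*?)</div>")

PAGE = (
    "<div>Time</div><div>ProgramName</div>\n"
    "<div>Time</div><div>ProgramName</div>\n"
    "<div>Time</div><div>ProgramName</div><div>Inconsistency</div>\n"
    "<div>Time</div><div>ProgramName</div>\n"
)


def parse(page):
    """Return (time, name) pairs in the order they appear on the page."""
    return TEMPLATE.findall(page)


rows = parse(PAGE)
for number, (time, name) in enumerate(rows, start=1):
    print(f"{number}. #TIME:{time}; #NAME:{name}")
```

In this sketch the fourth pair comes out as `Inconsistency` / `Time`, because the extra `<div>` shifts every following field by one position.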
Solution:
A solution to this problem is to replace or remove the inconsistency before the page is parsed (that is, before it is sent to the template parser).
Since there was no way to do this, I implemented it myself.
Solution example:
page source:
Code:
...
<div>Time</div><div>ProgramName</div>
<div>Time</div><div>ProgramName</div>
<div>Time</div><div>ProgramName</div><div>Inconsistency</div>
<div>Time</div><div>ProgramName</div>
…
template:
Code:
<div>#TIME</div><div>#NAME</div>
replace:
Code:
<replace match="&lt;div&gt;Inconsistency&lt;/div&gt;" replace="" />
page source before parsing:
Code:
...
<div>Time</div><div>ProgramName</div>
<div>Time</div><div>ProgramName</div>
<div>Time</div><div>ProgramName</div>
<div>Time</div><div>ProgramName</div>
…
result:
Code:
1. #TIME:Time; #NAME:ProgramName
2. #TIME:Time; #NAME:ProgramName
3. #TIME:Time; #NAME:ProgramName
4. #TIME:Time; #NAME:ProgramName
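The same effect can be sketched in Python. Again this is an illustration under assumptions, not the patch itself: `TEMPLATE` is a hypothetical regex stand-in for the template, and `apply_replace` is a hypothetical helper that treats `match` as a regular expression and `replacement` as plain text.

```python
import re

# Hypothetical regex stand-in for the template <div>#TIME</div><div>#NAME</div>.
TEMPLATE = re.compile(r"<div>(.*?)</div>\s*<div>(.*?)</div>")

PAGE = (
    "<div>Time</div><div>ProgramName</div>\n"
    "<div>Time</div><div>ProgramName</div>\n"
    "<div>Time</div><div>ProgramName</div><div>Inconsistency</div>\n"
    "<div>Time</div><div>ProgramName</div>\n"
)


def apply_replace(page, match, replacement):
    """Replace every regex match with the replacement, taken literally.

    Passing a function to re.sub keeps backslashes and group references
    in the replacement text from being interpreted.
    """
    return re.sub(match, lambda _m: replacement, page)


# Remove the inconsistency first, then run the template over the result.
cleaned = apply_replace(PAGE, r"<div>Inconsistency</div>", "")
rows = TEMPLATE.findall(cleaned)
for number, (time, name) in enumerate(rows, start=1):
    print(f"{number}. #TIME:{time}; #NAME:{name}")
```

With the extra `<div>` removed before parsing, all four rows line up as `Time` / `ProgramName`.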
How this works:
Now you can add the following new tags to your grabber file.
Code:
<replaces>
  <replace match="" replace="" />
  <replace match="" replace="" />
  ...
</replaces>
Legend:
match: The text you want to replace (regular expression).
replace: The text to insert in its place (plain text, no regular expression).
1. The replace function performs all replacements in the order they appear in your file, from top to bottom.
2. The replacements are done before template parsing, but after the page has been cut (template-start, template-end), so they affect only the relevant part of the page.
3. Because of point 2, and because of the current structure of the "htmlparser" and "webepg" code, the replaces tag must be placed inside the template tag.
Code:
<template>
<replaces>
…
</replaces>
</template>
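The three points above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the patched htmlparser/webepg code: the XML fragment is a stripped-down, hypothetical grabber template (real grabber files contain more elements), the second `<replace>` entry and the `<!--begin-->` / `<!--end-->` markers are made up for the example, and `load_replaces`, `cut` and `prepare` are hypothetical helper names.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, minimal template fragment. Inside an XML attribute,
# "<" has to be written as "&lt;".
GRABBER_XML = r"""
<template>
  <replaces>
    <replace match="&lt;div&gt;Inconsistency&lt;/div&gt;" replace="" />
    <replace match="\s+" replace=" " />
  </replaces>
</template>
"""


def load_replaces(xml_text):
    """Read (match, replace) pairs in file order, top to bottom."""
    root = ET.fromstring(xml_text.strip())
    return [
        (node.get("match", ""), node.get("replace", ""))
        for node in root.findall("./replaces/replace")
    ]


def cut(page, start, end):
    """Keep only the text between the start and end markers."""
    begin = page.find(start)
    begin = 0 if begin < 0 else begin + len(start)
    stop = page.find(end, begin)
    return page[begin:] if stop < 0 else page[begin:stop]


def prepare(page, start, end, replaces):
    """Cut the page first, then apply every replacement in order."""
    text = cut(page, start, end)
    for pattern, replacement in replaces:
        # match is a regex; the replacement is inserted as plain text.
        text = re.sub(pattern, lambda _m, r=replacement: r, text)
    return text


page = (
    "<html><div>Inconsistency</div><!--begin-->"
    "<div>A</div>  <div>Inconsistency</div>\n<div>B</div>"
    "<!--end--><div>Inconsistency</div></html>"
)
result = prepare(page, "<!--begin-->", "<!--end-->", load_replaces(GRABBER_XML))
print(result)
```

Only the part between the two markers is touched, and the second replacement sees the output of the first, which is why the order in the file matters.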
Example Grabber File:
(see the attached file)
Git Pull request:
URL: git_pull_request
I hope you will add this code to the svn/git repository so that it can be included in the next MediaPortal release, 1.2.3.
Edit: Added binaries for testing (copy them to the Team MediaPortal\MediaPortal TV Server directory).