AutoCropper plugin

knutinh · August 11, 2006

Well, I could try recording some tonight. I can offer DVB-T and analog cable recordings (PAL). 30-60 seconds of each channel/format enough?

-k

knutinh · August 11, 2006

It would be cool to contribute to testing different algos, but my contribution would in that case be MATLAB scripts that use native functions to load one frame at a time (non-realtime).

Slow runtime, but very rapid development =)

-k

ziphnor · August 11, 2006

knutinh said:
Well, I could try recording some tonight. I can offer DVB-T and analog cable recordings (PAL). 30-60 seconds of each channel/format enough?

-k

At the point I'm at now I would actually prefer frame grabs; I really don't know how to load a video file frame-by-frame in a C# program. But please make the recordings (maybe use MP's converter to MPEG-2), then we can always grab frames later.

It seems I'll have to learn DirectShow, damn....

knutinh · August 11, 2006

ziphnor said:
It seems I'll have to learn DirectShow, damn....

I have found a plugin for Visual Studio that automatically generates the DirectShow framework for a Source, Renderer, In-place or Transform filter. It basically lets you fill in the core algorithm, provided, of course, that you don't need any non-standard communication with the outside.

DirectShow is complex, yes. But do you really need it to test algos? I am sure that some C# library can utilise Windows libs to load one MPEG-2 frame at a time? MATLAB can, although you'd better grab a can of your favourite beverage while waiting for the result ;-)

-k

ziphnor · August 11, 2006

knutinh said:
I have found a plugin for Visual Studio that automatically generates the DirectShow framework for a Source, Renderer, In-place or Transform filter. It basically lets you fill in the core algorithm, provided, of course, that you don't need any non-standard communication with the outside.

That's interesting, where did you find it?

knutinh said:
DirectShow is complex, yes. But do you really need it to test algos? I am sure that some C# library can utilise Windows libs to load one MPEG-2 frame at a time? MATLAB can, although you'd better grab a can of your favourite beverage while waiting for the result ;-)

From my quick google it seems that frame grabbing in C# is not that simple after all. Anyway, if the most likely implementation is an in-place transform filter (it's probably not actually going to transform anything, but just scan a frame and then call somewhere else in MP for the cropping), I might as well start there. It seems that there is an interface called ISampleGrabber in C++, which basically gives you a callback with the frame data. It's covered here:

http://www.codeproject.com/audio/VideoAnaFramework.asp

It might be sensible enough to do it in C++ in this manner (I don't mind, especially if a wrapper takes care of all the WinAPI/COM crap).

Alternatively I can just extend this filter base class:
http://msdn.microsoft.com/library/d...s/directshow/htm/ctransinplacefilterclass.asp

That way I can test the filter using GraphEdit on video files, and it can then plug straight into MP afterwards.

I'm open to suggestions, but I think I'm going to try one of the above if nothing better shows up (they also have the extra benefit that they might help me implement the DS graph thread priority I was looking to implement a while ago).

jawbroken · August 11, 2006

I replied in the other thread with an algorithm idea, but I think the discussion is better suited to this thread, where it will not pollute the ratio thread, which is slightly different. I will repost here to centralise discussion, if that is okay.

Thank you for the code, I will need to dust off my copy of Visual Studio as I have been writing mostly Java code for uni at the moment. Your image code should be a helpful starting point to get me up to scratch. I have been thinking that the best way to find corners would be to start in, say, the top left corner and scan down diagonally, say in units of 4 pixels, then backtrack when you find interesting pixels to find the edge. Then you can figure out if it is a horizontal or vertical edge by sampling around the area. Once it has been determined to be horizontal or vertical, the edge can be followed to the corner. Then the same steps can be followed from the bottom right corner. Then it only remains to scan the subtitle area, which is best done in its own special way, I think, as it differs greatly from regular video sections.

A few issues:
1) Finding the corner will only work for frames with a clear contrast between the black sections and the frame. If this is running as a DS filter then this is not a big issue, as any frames not usable to find bounds can be ignored until a good frame comes along.
2) Ignoring subtitle sections and logos when the diagonal scan is being done. This test can probably be incorporated into the test to find out if it is a horizontal or vertical edge. So a Region of Interest can be a horizontal edge, a vertical edge or neither (neither being a logo or subtitle).
3) Some stuff I probably haven't even considered yet (special cases, etc).

I think that this method, while being much more specialised and sectionalised than your approach, could yield a significant decrease in pixel tests.

Any input/ideas or an entirely different approach that is more efficient would be welcome.
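The diagonal scan described above can be sketched roughly as follows. This is only an illustration in Python, not code from the plugin: the frame is assumed to be a list of rows of luma values, and `BLACK_MAX`, `MIN_FILL`, the quarter-frame probe length and all function names are made-up assumptions. It only handles the top-left corner; the bottom-right scan would mirror it.

```python
BLACK_MAX = 16   # assumed luma threshold; at or below this counts as black bar
STEP = 4         # diagonal step in pixels, as suggested in the post
MIN_FILL = 0.9   # assumed fraction of content pixels that makes a run "solid"


def is_content(frame, x, y):
    return frame[y][x] > BLACK_MAX


def fill(frame, x, y, dx, dy, length):
    """Fraction of content pixels on a run from (x, y) in direction (dx, dy)."""
    height, width = len(frame), len(frame[0])
    hits = total = 0
    for _ in range(length):
        if not (0 <= x < width and 0 <= y < height):
            break
        hits += is_content(frame, x, y)
        total += 1
        x, y = x + dx, y + dy
    return hits / total if total else 0.0


def classify(frame, x, y):
    """Sample around (x, y): horizontal edge, vertical edge, corner or neither."""
    w, h = len(frame[0]) // 4, len(frame) // 4
    # A logo or subtitle is small, so it fails the "solid run" test.
    if fill(frame, x, y, 1, 0, w) < MIN_FILL or fill(frame, x, y, 0, 1, h) < MIN_FILL:
        return "neither"
    above_black = y == 0 or fill(frame, x, y - 1, 1, 0, w) == 0.0
    left_black = x == 0 or fill(frame, x - 1, y, 0, 1, h) == 0.0
    if above_black and left_black:
        return "corner"
    if above_black:
        return "horizontal"
    if left_black:
        return "vertical"
    return "neither"


def find_top_left(frame):
    """Return (left, top) of the picture area, or None if the frame is unusable."""
    pos = 0
    while pos < min(len(frame), len(frame[0])):
        if is_content(frame, pos, pos):
            x = y = pos
            # Backtrack along the diagonal to the first interesting pixel.
            while x > 0 and is_content(frame, x - 1, y - 1):
                x, y = x - 1, y - 1
            kind = classify(frame, x, y)
            if kind == "horizontal":      # follow the edge left to the corner
                while x > 0 and is_content(frame, x - 1, y):
                    x -= 1
            elif kind == "vertical":      # follow the edge up to the corner
                while y > 0 and is_content(frame, x, y - 1):
                    y -= 1
            if kind != "neither":
                return x, y
        pos += STEP
    return None


def make_frame(width, height, left, top, right, bottom, value=128):
    """Synthetic test frame: black everywhere except the given rectangle."""
    return [[value if left <= x < right and top <= y < bottom else 0
             for x in range(width)] for y in range(height)]
```

A frame that returns `None` (for example an all-black frame) would simply be skipped until a better one comes along, which matches issue 1 above. The thresholds are guesses and would need tuning against real recordings.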

jawbroken · August 11, 2006

Does a DirectShow filter process video as frames or as a stream?

If it is a stream then my algorithm is not well suited, as it requires backtracking, so a buffer would be required.

I have a good stream-based algorithm in mind if stream data is available. It would be good to figure out exactly how image data will be presented so that an algorithm can be developed to take advantage of that.

ziphnor · August 11, 2006

jawbroken said:
Thank you for the code, I will need to dust off my copy of Visual Studio as I have been writing mostly Java code for uni at the moment. Your image code should be a helpful starting point to get me up to scratch. I have been thinking that the best way to find corners would be to start in, say, the top left corner and scan down diagonally, say in units of 4 pixels, then backtrack when you find interesting pixels to find the edge. Then you can figure out if it is a horizontal or vertical edge by sampling around the area. Once it has been determined to be horizontal or vertical, the edge can be followed to the corner. Then the same steps can be followed from the bottom right corner. Then it only remains to scan the subtitle area, which is best done in its own special way, I think, as it differs greatly from regular video sections.

Doesn't scanning diagonally put you in danger of 'discovering' logos? (Many channels have logos in the top left corner.) I think it's best to scan down the middle, since that's where the 'action' is.

I would prefer to decrease pixel checks by means of binary search, i.e. start by checking the middle of the screen; if it has image content, try 1/4 up/down (depending on scan direction), etc. To further decrease time complexity I am considering the histogram technique suggested, but with one histogram per color component (i.e. YUV or RGB). The maximum number of different values seen over all components would then decide whether it was image content or not, with an added check for the occurrence of high brightness.
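As a rough illustration of the two ideas above (per-component histograms to classify a line, and binary search from the middle of the screen), here is a minimal Python sketch. The frame layout (rows of `(Y, U, V)` tuples), the thresholds `MAX_DISTINCT` and `BRIGHT_Y`, and the function names are all assumptions made for the example. Note that the binary search assumes every line between the bar and the middle of the screen has content, which a dark scene could violate.

```python
from collections import Counter

MAX_DISTINCT = 4   # assumed: a bar line may still show a few noise values
BRIGHT_Y = 64      # assumed: luma level a black bar never reaches


def line_has_content(line):
    """Classify one scan line using one histogram per colour component."""
    histograms = [Counter(), Counter(), Counter()]
    for pixel in line:
        for hist, value in zip(histograms, pixel):
            hist[value] += 1
    most_distinct = max(len(hist) for hist in histograms)
    brightest = max(histograms[0])  # largest luma value seen on this line
    return most_distinct > MAX_DISTINCT or brightest >= BRIGHT_Y


def find_top_edge(frame):
    """First content line in the top half, or None if the middle line is black."""
    lo, hi = 0, len(frame) // 2
    if not line_has_content(frame[hi]):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if line_has_content(frame[mid]):
            hi = mid          # edge is at mid or above it
        else:
            lo = mid + 1      # edge is below mid
    return lo


def find_bottom_edge(frame):
    """Last content line in the bottom half, or None if the middle line is black."""
    lo, hi = len(frame) // 2, len(frame) - 1
    if not line_has_content(frame[lo]):
        return None
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if line_has_content(frame[mid]):
            lo = mid          # edge is at mid or below it
        else:
            hi = mid - 1      # edge is above mid
    return lo
```

With this layout each search touches only about log2(height / 2) lines per edge, at the cost of reading every pixel on the lines it does touch.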

ziphnor · August 11, 2006

jawbroken said:
Does a DirectShow filter process video as frames or as a stream?

A transform filter seems to be handed individual frames in its Transform method:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwmt/html/grabbersample.asp

jawbroken · August 11, 2006

ziphnor said:
Doesn't scanning diagonally put you in danger of 'discovering' logos? (Many channels have logos in the top left corner.) I think it's best to scan down the middle, since that's where the 'action' is.

I would prefer to decrease pixel checks by means of binary search, i.e. start by checking the middle of the screen; if it has image content, try 1/4 up/down (depending on scan direction), etc. To further decrease time complexity I am considering the histogram technique suggested, but with one histogram per color component (i.e. YUV or RGB). The maximum number of different values seen over all components would then decide whether it was image content or not, with an added check for the occurrence of high brightness.

Yes, it does put you in danger of discovering logos, but as these are small in terms of screen real estate they should be easy to discard. As I stated in my second point, this could probably be easily done in the same algorithm that determines if an edge is horizontal or vertical.

Perhaps your "binary search" method might work better, but I just get the impression that edge detection will need to examine fewer pixels than scans. I might try to knock something together over the weekend and perhaps collect some data on number of pixels examined and frame processing time.
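One possible way to collect that kind of data is to wrap the frame in an object that counts every pixel read, then time each detector on the same frame. The Python sketch below is only an assumption about how such a harness could look; `CountingFrame`, `measure` and the two toy detectors (which look at a single middle-column pixel per line) are invented for the example and stand in for the real algorithms.

```python
import time


class CountingFrame:
    """Wraps rows of luma values and counts every pixel that is examined."""

    def __init__(self, rows):
        self.rows = rows
        self.height = len(rows)
        self.width = len(rows[0])
        self.reads = 0

    def luma(self, x, y):
        self.reads += 1
        return self.rows[y][x]


def top_edge_linear(frame, black_max=16):
    """Toy detector: walk down the middle column until a non-black pixel."""
    x = frame.width // 2
    for y in range(frame.height):
        if frame.luma(x, y) > black_max:
            return y
    return None


def top_edge_bisect(frame, black_max=16):
    """Toy detector: binary search on the middle column, starting mid-screen."""
    x = frame.width // 2
    lo, hi = 0, frame.height // 2
    if frame.luma(x, hi) <= black_max:
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if frame.luma(x, mid) > black_max:
            hi = mid
        else:
            lo = mid + 1
    return lo


def measure(detector, rows):
    """Run one detector and report (result, pixels examined, seconds taken)."""
    frame = CountingFrame(rows)
    start = time.perf_counter()
    result = detector(frame)
    elapsed = time.perf_counter() - start
    return result, frame.reads, elapsed
```

Any detector that reads pixels only through `luma()` can be measured the same way, so an edge-following approach and a scanning approach could be compared on identical frames.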

I also have another idea to do with the stream algorithm, which may be fast even for frame data because of caching and linear access.
