AutoCropper plugin

ziphnor

Retired Team Member
  • Premium Supporter
  • August 4, 2005
    755
    13
    Copenhagen
    Home Country
    Denmark
    jawbroken said:
    I might try to knock something together over the weekend and perhaps collect some data on number of pixels examined and frame processing time.

    You do that, but there is a good chance it will have to be done in a C++ DS filter, which you might want to keep in mind. That is, keep it very simple so the code can be copy-pasted later :)

    I'm going to try to set up the framework for a DirectShow filter, and then we can plug in one or both algorithms later. I still feel that actually scanning into logos is overcomplicating things, but it's good to have several methods explored.
     

    jawbroken

    Portal Pro
    August 13, 2005
    706
    0
    Home Country
    Afghanistan
    I finally got onto this today (I have been working hard at work and university because I am going travelling in Europe at the end of the year and need the money).

    I have coded up a generic framework for testing and benchmarking. It should be clear how to use it from the Main method. The main methods that need to be filled in are the isInterestingRow and isInterestingColumn functions. I have coded up the routines to do the binary search for the edges. The binary search currently looks between the edge of the frame and 25% in; this can be made configurable or changed if required. From my tests with 4:3 in 16:9 and 16:9 in 4:3 mockups I created, this 25% should be sufficient.
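    In rough terms the edge search is something like this (a C++ sketch only; the names are made up for illustration and the actual framework code differs):

    // Binary search for the first "interesting" row between the top of the
    // frame and 25% of the height, assuming everything above the picture is
    // uninteresting and everything inside it is interesting.
    int FindTopEdge(const unsigned char* frame, int width, int height,
                    bool (*isInterestingRow)(const unsigned char*, int, int, int))
    {
        int lo = 0;                  // known black-bar side (frame edge)
        int hi = height / 4;         // never search further than 25% into the frame
        while (lo < hi)
        {
            int mid = (lo + hi) / 2;
            if (isInterestingRow(frame, width, height, mid))
                hi = mid;            // picture content found at or above mid
            else
                lo = mid + 1;        // still inside the black bar
        }
        return lo;                   // first row treated as part of the picture
    }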

    I have coded in very simplistic isInterestingRow and isInterestingColumn methods; they work fine for normal frames, but from what I can see your variance method works a lot better. They won't pick up subtitles, etc., but will ignore logos. If I have time I will plug in your variance code tonight.
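    The simplistic row test is along these lines (again only a sketch; the thresholds are made up, and ziphnor's variance method would replace the body):

    // A row is "interesting" if enough of its sampled pixels are brighter
    // than a threshold. Operates on an 8-bit luma/grayscale plane.
    bool IsInterestingRowSimple(const unsigned char* luma, int width, int height, int row)
    {
        (void)height;                         // not needed for this simple test
        const int threshold = 32;             // luma above this counts as content
        const int needed    = 8;              // bright pixels required to call the row interesting
        int bright = 0;
        const unsigned char* p = luma + row * width;
        for (int x = 0; x < width; x += 16)   // coarse, evenly spaced samples
        {
            if (p[x] > threshold && ++bright >= needed)
                return true;
        }
        return false;
    }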

    With my simplistic testing (which is also not very efficient), I am getting promising time results. For example:

    Starting Benchmark:
    Results:
    10 frames processed
    Took 30.0432 ms
    Average of 3.00432 ms per frame
    That equates to 332.854023539437 fps

    This is running on my old Mobile P4 1.7 GHz with 224MB of RAM.

    So this routine could probably be run every 4 frames or so with negligible CPU impact.

    Edit: I have emailed the framework to you.
     

    ziphnor

    Retired Team Member
  • Premium Supporter
  • August 4, 2005
    755
    13
    Copenhagen
    Home Country
    Denmark
    jawbroken said:
    This is running on my old Mobile P4 1.7 GHz with 224MB of RAM.

    So this routine could probably be run every 4 frames or so with negligible CPU impact.

    Edit: I have emailed the framework to you.

    Excellent. I guess it shouldn't be that surprising that it can be done on the fly; after all, many of the other filters process the entire image every frame. When implemented as a DS filter in C++ it will be very efficient.

    In fact, when I thought some more about it, I even think it might be possible to automatically move subtitles in the black bars into the video image. By scanning lines it shouldn't be that hard to make a separate bounding box for the subtitles, and finding the pixels containing text is just a matter of taking all the VERY bright pixels and 'moving' them up into the video image. But let's get the simple things running first :)
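    Something along these lines, perhaps (a very rough sketch just to illustrate the idea; the threshold is made up and the actual copying of pixels is left out):

    // Find the bounding box of "very bright" pixels inside the bottom black
    // bar of an 8-bit luma plane; that box would then be moved up into the
    // visible picture.
    struct Box { int left, top, right, bottom; bool empty; };

    Box FindSubtitleBox(const unsigned char* luma, int width, int height, int barTop)
    {
        const int threshold = 200;                 // "very bright" pixels only
        Box box = { width, height, -1, -1, true };
        for (int y = barTop; y < height; ++y)      // scan only the bottom bar
        {
            const unsigned char* row = luma + y * width;
            for (int x = 0; x < width; ++x)
            {
                if (row[x] > threshold)
                {
                    if (x < box.left)   box.left   = x;
                    if (x > box.right)  box.right  = x;
                    if (y < box.top)    box.top    = y;
                    if (y > box.bottom) box.bottom = y;
                    box.empty = false;
                }
            }
        }
        return box;
    }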

    Right now my focus is on trying to create the filter by extending the CTransInPlaceFilter class. It seems all I have to do is implement a method for handling a frame (the Transform method) and CheckInputType or similar for checking which input types are accepted (for example, the cropper shouldn't accept audio ;). I'm thinking it might be possible to actually crop the video directly, so that no extra aspect ratio support will be needed. However, my biggest problem right now is actually neither the detection algorithm nor understanding DirectShow, but just getting a DirectShow development environment up and running with VS 2005. You'll probably see me asking questions about this in a separate thread if I don't figure it out soon.
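    If I understand the base class correctly, the skeleton would be roughly this (only a sketch; the class name and GUID are placeholders and the COM registration plumbing is omitted):

    #include <streams.h>   // DirectShow base classes (CTransInPlaceFilter, CMediaType, ...)

    // Placeholder GUID -- a real filter needs its own, generated with guidgen.
    static const GUID CLSID_AutoCropper =
    { 0xb64d0a21, 0x1234, 0x4a2b, { 0x9d, 0x53, 0x7b, 0x1f, 0x3e, 0x6c, 0x00, 0x01 } };

    class CAutoCropFilter : public CTransInPlaceFilter
    {
    public:
        CAutoCropFilter(LPUNKNOWN pUnk, HRESULT* phr)
            : CTransInPlaceFilter(NAME("AutoCropper"), pUnk, CLSID_AutoCropper, phr) {}

        // Only accept video input; a real check would also look at the subtype
        // (YUY2, RGB24, ...) before agreeing to connect.
        HRESULT CheckInputType(const CMediaType* mtIn)
        {
            if (mtIn->majortype != MEDIATYPE_Video)
                return VFW_E_TYPE_NOT_ACCEPTED;
            return S_OK;
        }

        // Called once per frame; the sample buffer can be read and modified in place.
        HRESULT Transform(IMediaSample* pSample)
        {
            BYTE* pData = NULL;
            HRESULT hr = pSample->GetPointer(&pData);
            if (FAILED(hr))
                return hr;
            long len = pSample->GetActualDataLength();
            // ... run the bar detection (and eventually the cropping) on pData here ...
            (void)len;
            return S_OK;
        }
    };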
     

    mzemina

    Retired Team Member
  • Premium Supporter
  • February 23, 2005
    2,065
    14
    Tulsa, OK
    Home Country
    United States of America
    I hope you guys are releasing the source code, just to ensure that if someone can't keep working on it due to personal reasons, someone else is able to pick it up and continue on.

    Mike
     

    ziphnor

    Retired Team Member
  • Premium Supporter
  • August 4, 2005
    755
    13
    Copenhagen
    Home Country
    Denmark
    Some progress on the DS build setup. There is a DirectShow filter wizard for VS 2003 that can set up an in-place transform filter project automatically. Unfortunately it doesn't work in 2005, so I had to install 2003 and create the project there. I couldn't compile it there, however, BUT 2005 could then successfully compile the project it couldn't set up itself ;) I just looove C++ linker/build path crap ;) I think it's a REALLY good thing that most of MP is C#.

    I loaded up GraphEdit and asked it to set up a graph for rendering an MPEG-2 file, then tried to insert the new filter (it does nothing right now) after the MPEG-2 decoder. It failed. I then changed CheckInputType to accept anything (it only accepted RGB before, not YUV), and after that I could insert it and play the file through it. So my task list now is:

    1. Figure out which color spaces need to be supported (probably YUV?)
    2. Figure out how YUV data is stored in the supplied buffer.
    3. Implement the algorithm in a YUV context.

    mzemina said:
    I hope you guys are releasing the source code, just to ensure that if someone can't keep working on it due to personal reasons, someone else is able to pick it up and continue on.

    I can assure you that the source code will be released, most likely submitted straight into MP's codebase. I have limited time to work on this myself, and will probably make it work for my own (i.e. 16:9 display) uses and leave it there (ah okay, maybe I'll try for automatically moving the subtitles). Because of that I also aim to make certain that someone else can keep working on it.
     

    tourettes

    Retired Team Member
  • Premium Supporter
  • January 7, 2005
    17,301
    4,800
    ziphnor said:
    1. Figure out which color spaces need to be supported (probably YUV?)
    2. Figure out how YUV data is stored in the supplied buffer.
    3. Implement the algorithm in a YUV context.

    At least YUV needs to be supported, but I would assume that some RGB modes also need to be supported (the YUV mixing mode in VMR9 isn't available in Windows versions older than XP SP2).

    In SVN there is some YUV mode color handling in the old DVB subtitle transform filter -> MediaPortal\Filters\SubTrans\source\SubTransform.cpp
     

    jawbroken

    Portal Pro
    August 13, 2005
    706
    0
    Home Country
    Afghanistan
    Sounds good. Since moving to Java/C# I can't stand the C++/C linkers and build tools either.

    YUV is probably a good format to work with as you get the brightness for "free".
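    For example, assuming a packed YUY2 buffer (just one possible layout; the exact format the decoder delivers still needs to be confirmed), reading the brightness of a pixel is trivial:

    // In packed YUY2 each pixel pair is stored as Y0 U Y1 V, so luma is simply
    // every second byte -- no RGB-to-brightness conversion needed.
    // stride is the length of one row in bytes (>= 2 * width for YUY2).
    inline unsigned char LumaAtYUY2(const unsigned char* frame, int stride, int x, int y)
    {
        return frame[y * stride + 2 * x];
    }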

    Have you had a chance to take a look at my binary search framework yet? I haven't had time to understand your code and plug in your variance based code yet.

    I did, however, notice you are using random sample points for your algorithm. I don't know if this was a temporary thing, but I think a deterministic sampling pattern would be best. This could just be one with equidistant points, but I think I have a better idea. The sample points should be weighted so that content in the middle of the screen is sampled more densely than the edges. Because subtitles appear in the middle of the row, and logos appear at the edges, this should be an ideal pattern. I will figure out a mathematical way to calculate a good distribution shortly.
     

    jawbroken

    Portal Pro
    August 13, 2005
    706
    0
    Home Country
    Afghanistan
    Okay, I have something that works well, I think. It will be easy enough to code as a lookup array once the number of sample points is decided upon.

    I started with a cubic, but that was too flat around the midpoint. So I used the inverse sine function, which is not too flat, but still clusters points in the centre of the screen.

    For, say, 21 sample points (n = 0 to 20):

    f(n) = arcsin((n-10)/10)*0.8/pi + 0.5

    As n goes from 0 to 20, f(n) ranges from 0.1 to 0.9 (to avoid sampling out of the frame or uninteresting areas).
    To adapt to a different number of samples, the (n-10)/10 part is changed so that it ranges from -1 to 1 as n goes from 0 to maxN.

    For the above:
    f(n) = [0.10000, 0.21485, 0.26387, 0.30255, 0.33613, 0.36667, 0.39521, 0.42241, 0.44872, 0.47449, 0.50000, 0.52551, 0.55128, 0.57759, 0.60479, 0.63333, 0.66387, 0.69745, 0.73613, 0.78515, 0.90000]

    Then all you need to do is multiply by the width or height and round to an integer to get the sample pixel. For example, looking at a 720x576 frame, sample points along a row mean multiplying by 720, giving points at:
    sample points = [72, 155, 190, 218, 242, 264, 285, 304, 323, 342, 360, 378, 397, 416, 435, 456, 478, 502, 530, 565, 648]

    or for columns, multiply by 576, giving:
    sample points = [58, 124, 152, 174, 194, 211, 228, 243, 258, 273, 288, 303, 318, 333, 348, 365, 382, 402, 424, 452, 518]
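    As a C++ sketch (illustrative only; in the filter this would probably just end up as a precomputed lookup array):

    #include <cmath>
    #include <vector>

    // Arcsin-weighted sample positions in [0.1, 0.9], scaled to the frame
    // dimension and rounded to pixel indices. MakeSamplePoints(21, 720)
    // should reproduce the row sample points listed above.
    std::vector<int> MakeSamplePoints(int numSamples, int dimension)
    {
        std::vector<int> points;
        const double pi = 3.14159265358979323846;
        for (int n = 0; n < numSamples; ++n)
        {
            double t = 2.0 * n / (numSamples - 1) - 1.0;            // map n to [-1, 1]
            double f = std::asin(t) * 0.8 / pi + 0.5;               // f(n) in [0.1, 0.9]
            points.push_back(static_cast<int>(f * dimension + 0.5)); // round to a pixel
        }
        return points;
    }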
     

    jawbroken

    Portal Pro
    August 13, 2005
    706
    0
    Home Country
    Afghanistan
    Math is nice and all, but a picture is worth a thousand words:

    For example, if you were checking to see if a row was interesting, the sample points for each row are the blue lines on this picture:
    row_sample_points.jpg


    Same thing for columns:
    column_sample_points.jpg


    And both together just for fun:
    both_sample_points.jpg
     

    ziphnor

    Retired Team Member
  • Premium Supporter
  • August 4, 2005
    755
    13
    Copenhagen
    Home Country
    Denmark
    tourettes said:
    At least YUV needs to be supported, but I would assume that some RGB modes also need to be supported (the YUV mixing mode in VMR9 isn't available in Windows versions older than XP SP2).

    The Intervideo codec wants to output "MEDIASUBTYPE_YUY2" on all the samples I have tried. Is that the standard used by all MPEG-2 material/MPEG-2 decoders, or will I need to code up support for a host of different color spaces? I guess I would have to support more if it's supposed to also work after recordings have been compressed to another format....

    In SVN there is some YUV mode color handling in the old DVB subtitle transform filter -> MediaPortal\Filters\SubTrans\source\SubTransform.cpp

    Thanks, I'll take a look at it. Btw, how is the DVB subtitle development going?
     
