Machine Vision Box (Composed by Wangkang)

Computer Vision, Intelligent Algorithm

Video Surveillance System (Demo)

Filed under Computer Vision by Wang Kang on 06-05-2012

I designed a video surveillance system (a demo for evaluation). You can download it from the following cloud storage:

Download the demo
Shared video clips [1], [2], [3]

—————————————————————————–

Notes :

1. You need an internet connection, because my server authorizes the trial demo to run.

2. Do not delete the ‘configure.xml’ file in the folder; it contains some of the settings.

3. If you can’t play the video file, you need to install the codec.

—————————————————————————–

Snap 1: Multi-person tracking + automatic camera calibration (a 3D scene is created and the people are mapped onto it; a background event thread runs a nonlinear fit and RANSAC to estimate the camera K, R, T matrices from the tracked people).

Snap 2: Abandoned object detection. In the bottom-left window, the abandoned object is marked by a red bounding box, and the time it has been stationary is printed (minutes:seconds).

Snap 3: A newly stopped car is also an ‘abandoned object’, because the original background model does not contain it. This can be applied on highways to detect stopped cars.

Video Stabilization Using Point Feature Matching

Filed under Computer Vision by Wang Kang on 05-10-2011

This article is taken from MathWorks (thanks, MathWorks, it is a very nice article):

-----------------------------------------

Stabilizing a video that was captured from a jittery or moving platform is an important application in computer vision. One way to stabilize a video is to track a salient feature in the image and use this as an anchor point to cancel out all perturbations relative to it. This procedure, however, must be bootstrapped with knowledge of where such a salient feature lies in the first video frame. In this demo, we explore a method of video stabilization that works without any such a priori knowledge. It instead automatically searches for the “background plane” in a video sequence, and uses its observed distortion to correct for camera motion.

This stabilization algorithm involves two steps. First, we determine the affine image transformations between all neighboring frames of a video sequence using a Random Sample Consensus (RANSAC) [1] procedure applied to point correspondences between two images. Second, we warp the video frames to achieve a stabilized video. We will use System objects in the Computer Vision System Toolbox™, both for the algorithm and for display.

This demo is similar to the Video Stabilization Demo. The main difference is that the Video Stabilization Demo is given a region to track while this demo is given no such knowledge. Both demos use the same example video.

Step 1. Read Frames from a Movie File

Here we read in the first two frames of a video sequence. We read them as intensity images since color is not necessary for the stabilization algorithm, and because using grayscale images improves speed. Below we show both frames side by side, and we produce a red-cyan color composite to demonstrate the pixel-wise difference between them. There is obviously a large vertical and horizontal offset between the two frames.

filename = 'shaky_car.avi';
hVideoSrc = vision.VideoFileReader(filename, ...
                                      'ImageColorSpace', 'Intensity',...
                                      'VideoOutputDataType', 'single');

imgA = step(hVideoSrc); % Read first frame into imgA
imgB = step(hVideoSrc); % Read second frame into imgB

cvexShowImagePair(imgA, imgB, 'Frame A', 'Frame B');

figure; imshow(cat(3,imgA,imgB,imgB));
title('Color composite (frame A = red, frame B = cyan)');

Step 2. Collect Salient Points from Each Frame

Our goal is to determine a transformation that will correct for the distortion between the two frames. We can use the GeometricTransformEstimator System object for this, which will return an affine transform. As input we must provide this object with a set of point correspondences between the two frames. To generate these correspondences, we first collect points of interest from both frames, then select likely correspondences between them.

In this step we produce these candidate points for each frame. To have the best chance that these points will have corresponding points in the other frame, we want points around salient image features such as corners. For this we use the CornerDetector System object. The FAST corner detector algorithm, which we have selected below, is one of the fastest options.

The detected points from both frames are shown in the figure below. Observe how many of them cover the same image features, such as points along the tree line, the corners of the large road sign, and the corners of the cars.

maxPts = 150;
ptThresh = 1e-3;
hCD = vision.CornerDetector( ...
    'Method','Local intensity comparison (Rosen & Drummond)', ...
    'MaximumCornerCount', maxPts, ...
    'CornerThreshold', ptThresh, ...
    'NeighborhoodSize', [9 9]);
pointsA = step(hCD, imgA);
pointsB = step(hCD, imgB);

cvexShowImagePair(imgA, imgB, 'Corners in A', 'Corners in B', ...
    'SingleColor', pointsA, pointsB);

Step 3. Select Correspondences Between Points

Next we pick correspondences between the points derived above. For each point, we extract a 9-by-9 block centered around it. The matching cost we use between points is the sum of squared differences (SSD) between their respective image regions. Points in frame A and frame B are matched putatively. Note that there is no uniqueness constraint, so points from frame B can correspond to multiple points in frame A.
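To make the matching cost concrete, here is a minimal C++ sketch of SSD matching without a uniqueness constraint. It is not the toolbox's implementation: the names Patch, ssd and matchBySsd are made up for illustration, each patch is assumed to be a flattened block (for example 9*9 = 81 values), and frame B is assumed to have at least one patch.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// A square block of pixel intensities stored row by row (hypothetical type).
using Patch = std::vector<float>;

// Sum of squared differences between two patches of equal size.
float ssd(const Patch& a, const Patch& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// For every patch from frame A, return the index of the lowest-cost patch in
// frame B. There is no uniqueness constraint, so several patches from A may
// select the same patch from B.
std::vector<std::size_t> matchBySsd(const std::vector<Patch>& patchesA,
                                    const std::vector<Patch>& patchesB) {
    std::vector<std::size_t> bestIndex(patchesA.size(), 0);
    for (std::size_t i = 0; i < patchesA.size(); ++i) {
        float bestCost = std::numeric_limits<float>::max();
        for (std::size_t j = 0; j < patchesB.size(); ++j) {
            const float cost = ssd(patchesA[i], patchesB[j]);
            if (cost < bestCost) {
                bestCost = cost;
                bestIndex[i] = j;
            }
        }
    }
    return bestIndex;
}
```

Because every point in A simply takes its cheapest partner, wrong matches are expected; the robust estimation step is what filters them out.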

% Extract features for the corners
blockSize = 9; % Block size.
[featuresA, pointsA] = extractFeatures(imgA, pointsA, ...
    'BlockSize', blockSize);
[featuresB, pointsB] = extractFeatures(imgB, pointsB, ...
    'BlockSize', blockSize);

% Match features which were found in the current and the previous frames
indexPairs = matchFeatures(featuresA, featuresB, 'Metric', 'SSD');
pointsA = pointsA(indexPairs(:, 1), : );
pointsB = pointsB(indexPairs(:, 2), : );

The image below shows the same color composite given above, but added are the points from frame A in red, and the points from frame B in green. Yellow lines are drawn between points to show the correspondences selected by the above procedure. Many of these correspondences are correct, but there is also a significant number of outliers.

cvexShowMatches(imgA, imgB, pointsA, pointsB, 'A', 'B', 'RC');

Step 4. Estimating Transform from Noisy Correspondences

Many of the point correspondences obtained in the previous step are incorrect. But we can still derive a robust estimate of the geometric transform between the two images using the Random Sample Consensus (RANSAC) algorithm [1], which is implemented in the GeometricTransformEstimator System object. This object, when given a set of point correspondences, will search for the valid inlier correspondences. From these it will then derive the affine transform that makes the inliers from the first set of points match most closely with the inliers from the second set. This affine transform will be a 3-by-3 matrix of the form:

[a_1 a_3 t_r;
 a_2 a_4 t_c;
   0   0   1]

The parameters a define scale, rotation, and shearing effects of the transform, while the parameters t are translation parameters. This transform can be used to warp the images such that their corresponding features will be moved to the same image location.
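As a small illustration of what the matrix does to a single point, here is a C++ sketch. It assumes the column-vector convention, i.e. the matrix is multiplied by [r c 1]^T with r and c the row and column coordinates; the struct and function names are hypothetical. The MATLAB code in this article multiplies row vectors by the transposed matrix instead, which gives the same result.

```cpp
// Coefficients of the affine matrix [a1 a3 tr; a2 a4 tc; 0 0 1].
struct Affine {
    double a1, a2, a3, a4;  // scale, rotation and shear part
    double tr, tc;          // translation part
};

// A point given by its row and column coordinates (assumed convention).
struct Point {
    double r, c;
};

// Multiply the matrix by the homogeneous column vector [r c 1]^T.
Point applyAffine(const Affine& t, const Point& p) {
    return Point{t.a1 * p.r + t.a3 * p.c + t.tr,
                 t.a2 * p.r + t.a4 * p.c + t.tc};
}
```

With a1 = a4 = 1 and a2 = a3 = 0 the transform is a pure translation by (tr, tc).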

A limitation of the affine transform is that it can only alter the imaging plane. Thus it is ill-suited to finding the general distortion between two frames taken of a 3-D scene, such as with this video taken from a moving car. But it does work under certain conditions that we shall describe shortly.

We implement this procedure below. For added robustness, we run the GeometricTransformEstimator System object multiple times and calculate a cost for each result. This cost is obtained by projecting frame B onto frame A according to the derived transform, and taking the sum of absolute differences (SAD) between the two images. We take the best transform as the one that minimizes this cost.

hGTE = vision.GeometricTransformEstimator;
hGT = vision.GeometricTransformer;
hGTPrj = vision.GeometricTransformer;

% Run multiple RANSAC trials
nRansacTrials = 1;
Ts = cell(1,nRansacTrials);
costs = zeros(1,nRansacTrials);
nPts = int32(size(pointsA,2));
inliers = cell(1,nRansacTrials);

for j=1:nRansacTrials
    % Estimate affine transform
    [Ts{j},inliers{j}] = step(hGTE, pointsB, pointsA);

    % Warp image and compute error metric.
    imgBp = step(hGT, imgB, Ts{j});
    costs(j) = sum(sum(imabsdiff(imgBp, imgA)));
end
% Take best result.
[~,ix] = min(costs);
imgBp = step(hGT, imgB, Ts{ix});
pointsBp = [single(pointsB), ones(size(pointsB,1), 1)] * Ts{ix};
H = [Ts{ix} [0 0 1]'];

Below is a color composite showing frame A overlaid with the reprojected frame B, along with the reprojected point correspondences. The results are excellent, with the inlier correspondences nearly exactly coincident. The cores of the images are both well aligned, such that the red-cyan color composite becomes almost purely black-and-white in that region.

Note how the inlier correspondences are all in the background of the image, not in the foreground, which itself is not aligned. This is because the background features are distant enough that they behave as if they were on an infinitely distant plane. Thus, even though the affine transform is limited to altering only the imaging plane, here that is sufficient to align the background planes of both images. Furthermore, if we assume that the background plane has not moved or changed significantly between frames, then this transform is actually capturing the camera motion. Therefore correcting for this will stabilize the video. This condition will hold as long as the motion of the camera between frames is small enough, or, conversely, if the sample time of the video is high enough.

cvexShowMatches(imgA, imgBp, pointsA(inliers{ix}, : ), ...
    pointsBp(inliers{ix}, : ), 'A', 'B');

Step 5. Transform Approximation and Smoothing

Given a set of video frames $T_{i}, \quad i=0,1,2 \ldots$, we can now use the above procedure to estimate the distortion between all frames $T_i$ and $T_{i+1}$ as affine transforms, $H_i$. Thus the cumulative distortion of a frame $i$ relative to the first frame will be the product of all the preceding inter-frame transforms, or

$H_{cumulative,i} = \prod_{j=0}^{i-1} H_j$
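The running product can be sketched in C++ as follows. This is only an illustration: Mat3, identity3, multiply and cumulativeTransforms are hypothetical names, and the order of multiplication (new transform on the left) is an assumption that follows the update used in the full-video loop of Step 6.

```cpp
#include <array>
#include <vector>

using Mat3 = std::array<std::array<double, 3>, 3>;

// 3-by-3 identity matrix.
Mat3 identity3() {
    Mat3 m{};  // value-initialized, all entries are zero
    for (int i = 0; i < 3; ++i) m[i][i] = 1.0;
    return m;
}

// Plain 3-by-3 matrix product a * b.
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 out{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                out[i][j] += a[i][k] * b[k][j];
    return out;
}

// result[i] is the product of the first i inter-frame transforms, so
// result[0] is the identity (frame 0 is not distorted relative to itself).
std::vector<Mat3> cumulativeTransforms(const std::vector<Mat3>& interFrame) {
    std::vector<Mat3> result;
    Mat3 running = identity3();
    result.push_back(running);
    for (const Mat3& h : interFrame) {
        running = multiply(h, running);  // newest transform goes on the left
        result.push_back(running);
    }
    return result;
}
```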

We could use all the six parameters of the affine transform above, but, for numerical simplicity and stability, we choose to re-fit the matrix as a simpler scale-rotation-translation transform. This has only four free parameters compared to the full affine transform’s six: one scale factor, one angle, and two translations. This new transform matrix is of the form:

[s*cos(ang) s*-sin(ang) t_x;
 s*sin(ang)  s*cos(ang) t_y;
          0           0   1]

We demonstrate this conversion procedure below by fitting the above-obtained transform $H$ with a scale-rotation-translation equivalent, $H_{sRt}$. To show that the error of converting the transform is minimal, we reproject frame B with both transforms and show the two images below as a red-cyan color composite. As the image appears black and white, obviously the pixel-wise difference between the different reprojections is negligible.
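For readers who do not use MATLAB, the same fit can be sketched in C++. This mirrors the averaging idea of the code in this step, but it is only a sketch: ScaleRotation and fitScaleRotation are hypothetical names, and the 2-by-2 block is assumed to be close to scale * [cos(theta) -sin(theta); sin(theta) cos(theta)].

```cpp
#include <cmath>

struct ScaleRotation {
    double scale;
    double theta;  // rotation angle in radians
};

// Fit scale * [cos(theta) -sin(theta); sin(theta) cos(theta)] to a general
// 2-by-2 block [r11 r12; r21 r22].
ScaleRotation fitScaleRotation(double r11, double r12, double r21, double r22) {
    // Two independent estimates of the angle, averaged.
    const double theta = 0.5 * (std::atan2(r21, r11) + std::atan2(-r12, r22));
    // Two estimates of the scale from the diagonal entries, averaged.
    const double scale = 0.5 * (r11 + r22) / std::cos(theta);
    return ScaleRotation{scale, theta};
}
```

The division by cos(theta) becomes ill-conditioned near 90 degrees, and averaging two angles misbehaves near 180 degrees. For the small rotations between neighboring video frames this should not matter, but a general-purpose version would need to handle those cases.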

% Extract scale and rotation part sub-matrix.
R = H(1:2,1:2);
% Compute theta from mean of two possible arctangents
theta = mean([atan2(R(2),R(1)) atan2(-R(3),R(4))]);
% Compute scale from mean of two stable mean calculations
scale = mean(R([1 4])/cos(theta));
% Translation remains the same:
translation = H(3, 1:2);
% Reconstitute new s-R-t transform:
HsRt = [[scale*[cos(theta) -sin(theta); sin(theta) cos(theta)]; ...
  translation], [0 0 1]'];

imgBold = step(hGTPrj, imgB, H);
imgBsRt = step(hGTPrj, imgB, HsRt);
figure(2), clf;
imshow(cat(3,imgBold,imgBsRt,imgBsRt)), axis image;
title('Color composite of affine and s-R-t transform outputs');

Step 6. Run on the Full Video

Now we apply the above steps to smooth a video sequence. For readability, the above procedure of estimating the transform between two images has been placed in the MATLAB® function cvexEstStabilizationTform. The function cvexTformToSRT also converts a general affine transform into a scale-rotation-translation transform.

At each step we calculate the transform $H$ between the present frames. We fit this as an s-R-t transform, $H_{sRt}$. Then we combine this into the cumulative transform, $H_{cumulative}$, which describes all camera motion since the first frame. The last two frames of the smoothed video are shown in a Video Player as a red-cyan composite.

With this code, you can also take out the early exit condition to make the loop process the entire video.

% Reset the video source to the beginning of the file.
reset(hVideoSrc);
hGTE = vision.GeometricTransformEstimator;
hGT = vision.GeometricTransformer;
hGTPrj = vision.GeometricTransformer;

hVPlayer = vision.VideoPlayer; % Create video viewer

hCD = vision.CornerDetector( ...
    'Method','Local intensity comparison (Rosen & Drummond)', ...
    'MaximumCornerCount', maxPts, ...
    'CornerThreshold', ptThresh, ...
    'NeighborhoodSize', [9 9]);

% Process all frames in the video
movMean = step(hVideoSrc);
imgB = movMean;
imgBp = imgB;
correctedMean = imgBp;
ii = 2;
Hcumulative = eye(3);
while ~isDone(hVideoSrc) && ii < 10
    % Read in new frame
    imgA = imgB; % z^-1
    imgAp = imgBp; % z^-1
    imgB = step(hVideoSrc);
    movMean = movMean + imgB;

    % Estimate transform from frame A to frame B, and fit as an s-R-t
    H = cvexEstStabilizationTform(imgA,imgB,hCD,hGT,hGTE);
    HsRt = cvexTformToSRT(H);
    Hcumulative = HsRt * Hcumulative;
    imgBp = step(hGTPrj, imgB, Hcumulative);

    % Display as color composite with last corrected frame
    step(hVPlayer, cat(3,imgAp,imgBp,imgBp));
    correctedMean = correctedMean + imgBp;

    ii = ii+1;
end
correctedMean = correctedMean/(ii-2);
movMean = movMean/(ii-2);

% Here you call the release method on the objects to close any open files
% and release memory.
release(hVideoSrc);
release(hVPlayer);

During computation, we computed the mean of the raw video frames and of the corrected frames. These mean values are shown side-by-side below. The left image shows the mean of the raw input frames, proving that there was a great deal of distortion in the original video. The mean of the corrected frames on the right, however, shows the image core with almost no distortion. While foreground details have been blurred (as a necessary result of the car’s forward motion), this shows the efficacy of the stabilization algorithm.

cvexShowImagePair(movMean, correctedMean, ...
    'Raw input mean', 'Corrected sequence mean');

References

[1] Tordoff, B; Murray, DW. “Guided sampling and consensus for motion estimation.” European Conference on Computer Vision, 2002.

[2] Lee, KY; Chuang, YY; Chen, BY; Ouhyoung, M. “Video Stabilization using Robust Feature Trajectories.” National Taiwan University, 2009.

[3] Litvin, A; Konrad, J; Karl, WC. “Probabilistic video stabilization using Kalman filtering and mosaicking.” IS&T/SPIE Symposium on Electronic Imaging, Image and Video Communications and Proc., 2003.

[4] Matsushita, Y; Ofek, E; Tang, X; Shum, HY. “Full-frame Video Stabilization.” Microsoft® Research Asia. CVPR 2005.

Predator :-) = ( Tracking + Learning + Detection )

Filed under Computer Vision by Wang Kang on 12-07-2011

Recently I found a nice single-object tracker named Predator.

Here is an introduction:

—————————————————————————————————————————————————————————————————————————————-

The Predator contains :

  • Tracking: uses a forward-backward LK tracker. The highlight of this tracker is that it can correct mismatches in the optical flow. (For example, point A in image t-1 is matched to A’ in image t by LK; we call this the forward step. We then track backward from A’ in image t to A” in image t-1. If A” does not match the original point A, we treat the point as a mismatch and remove its optical flow.)
  • Learning: 1. At initialization, normalize the target to a small N*N patch, generate a lot of features (random blocks in the patch), and encode them with 2bitBP (Haar-like features). The resulting feature vectors of the target are the positive examples, and feature vectors generated from the background (far from the target) are the negative examples. These are used to initialize a random forest. 2. While tracking, use the P-N constraints to decide whether a patch is an unseen appearance of the object or background, and update the classifier accordingly.
  • Detection: uses a sliding window + integral image + 2bitBP (Haar-like features).
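The forward-backward check from the Tracking item can be sketched in C++ like this. It only shows the filtering step and assumes the LK tracker has already produced the back-tracked points A”. The fixed pixel threshold maxError and all names are assumptions made for this sketch, not taken from the Predator code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2f {
    float x, y;
};

// original[i]    : point A in image t-1
// backTracked[i] : point A'' found by tracking A forward to image t and
//                  then backward to image t-1
// Returns one flag per point: true if the round trip ends within maxError
// pixels of where it started, false if the point should be dropped.
std::vector<bool> forwardBackwardInliers(const std::vector<Point2f>& original,
                                         const std::vector<Point2f>& backTracked,
                                         float maxError) {
    std::vector<bool> keep(original.size(), false);
    for (std::size_t i = 0; i < original.size() && i < backTracked.size(); ++i) {
        const float dx = backTracked[i].x - original[i].x;
        const float dy = backTracked[i].y - original[i].y;
        keep[i] = std::sqrt(dx * dx + dy * dy) <= maxError;
    }
    return keep;
}
```

Points flagged false are the mismatches; their optical flow vectors would be discarded before estimating the target's motion.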

—————————————————————————————————————————————————————————————————————————————-

Here are some examples:




Point Cloud Rendering

Filed under Computer Vision by Wang Kang on 10-07-2011

This video is referenced from OpenCV.jp.

The data set used in the program can be downloaded from here.

This is a demo program that reprojects 2D image pixels into 3D space using a parallax (disparity) image.

————————————————————————————————————————————

Step 1. Read the stereo images and use semi-global block matching (SGBM) to calculate the disparity image.
Step 2. Reproject the parallax (disparity) image into 3D space with the Q matrix.
Step 3. Project the points from 3D coordinates to a new perspective.
————————————————————————————————————————————
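Step 2 can be sketched in C++ as below. It assumes the usual convention for a 4-by-4 reprojection matrix, [X Y Z W]^T = Q * [u v d 1]^T followed by a division by W, where (u, v) is the pixel and d its disparity. Mat4, Point3 and reproject are hypothetical names, and the caller has to make sure W is not zero (for example by skipping pixels with zero disparity).

```cpp
#include <array>

using Mat4 = std::array<std::array<double, 4>, 4>;

struct Point3 {
    double x, y, z;
};

// Reproject pixel (u, v) with disparity d through the 4-by-4 matrix Q.
Point3 reproject(const Mat4& Q, double u, double v, double d) {
    const std::array<double, 4> in = {u, v, d, 1.0};
    std::array<double, 4> out{};  // homogeneous result [X Y Z W], starts at zero
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            out[i] += Q[i][j] * in[j];
    // Divide by W to get Euclidean coordinates.
    return Point3{out[0] / out[3], out[1] / out[3], out[2] / out[3]};
}
```

The sign conventions inside Q differ between calibration tools, so the matrix should come from the same pipeline that produced the disparity image.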

You can find the operating instructions here.

————————————————————————————————————————————

[1] Makoto Ota, Norishige Fukushima, Tomohiro Yendo, Masayuki Tanimoto, Toshiaki Fujii, “Rectification of Pure Translation 2D Camera Array,” Proc. Of IWAIT2009, 0044, Jan. 2009.

About Tracking (1)

Filed under Computer Vision by Wang Kang on 06-01-2011

Why can the human eye track objects? What algorithm does the human brain run? Let’s talk about this.

1. Segmentation. Humans can easily distinguish and track a target that differs from the background, but can hardly track a target that looks similar to the background. For example, compare a polar bear on snow with a polar bear on grass, or a person in a camouflage uniform lying in the grass with one lying in the snow.

2. Motion detection. Do you feel dizzy when you stare at a waving stick? People find it easier to look at a static background scene than at a moving one. A rotating car ride at a playground, for example, can make people dizzy, because a moving background requires the brain to run a complex motion detection algorithm.

Use OpenMP to let your program run faster(2) – Parallel Search

Filed under Computer Vision by Wang Kang on 30-12-2010

If you want to search for data in an unordered dataset, you can divide the dataset into a lot of blocks and let each thread search one block. Here is an example:

Definition:

Input:        void **ppDataSet : A huge dataset in which you want to search for the target data

Input:        int nLen : Dataset length

Input:        void *pTargetData : Target data (you want to search for it in the dataset and return its locations)

Input:        CompareFunc Comp : Comparison function (matches two data items and returns their similarity)

Output:     vector<int> VLoc : Locations of the target data in the dataset

Sample Code:

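#include <omp.h>
#include <vector>
using std::vector;

// Assumed signature: returns a similarity (confidence), 1.0 means a match
typedef double (*CompareFunc)( void*, void* );

vector<int>   ParallelSearchData( void** ppDataSet, int nLen, void* pTargetData, CompareFunc Comp )
{
        int nCore   =    omp_get_num_procs();
        int nStep   =    nLen / nCore;
        vector<int> VLoc;

#pragma omp parallel for
        for ( int k = 0 ; k < nCore ; k++ )
        {
                  int iBegin  =   k * nStep;          // Each processor's search start location
                  int iEnd    =   (k+1) * nStep;      // Each processor's search end location

                  // nStep * nCore may fall short of nLen (integer division),
                  // so the last block runs to the end of the dataset
                  if ( k == nCore - 1 )         iEnd =  nLen;

                  for ( int i = iBegin ; i < iEnd ; i++ )   // i is private to each thread
                  {
                            // We assume confidence = 1.0 is target
                            if ( Comp( ppDataSet[i], pTargetData ) == 1.0 )
                            {
                                       // vector is not thread-safe: one writer at a time
#pragma omp critical
                                       VLoc.push_back( i );
                            }
                  }
        }

        return VLoc;   // Locations are not sorted, threads append in any order
}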

Use OpenMP to let your program run faster(1)

Filed under Computer Vision by Wang Kang on 30-12-2010

Recently I ran into a problem while designing a new real-time tracking algorithm at my company: unavoidable for loops and time-consuming operations were slowing it down. So I borrowed a book from the library and looked into multi-threading to solve this problem, and now I have found an answer.

Here is an example:

int iLoopTimes    =    1000;

// SAFE_FREE_CV_IMAGE, CLOCK_START and CLOCK_STOP are helper macros that are
// not shown here (presumably: release an IplImage, start and stop a timer).
void Job()
{
    // The loop counter is local, so jobs running in parallel do not share it
    for ( int i = 0 ; i < iLoopTimes ; i++ )
    {
        IplImage*    pImage    =    cvCreateImage( cvSize(100,100), 8, 3 );
        cvCvtColor( pImage, pImage, CV_BGR2HSV );
        SAFE_FREE_CV_IMAGE( pImage );
    }
}

void ParallelSection()
{
    //////////////////////////////////////////////////////////////////////////
    // Test 1: two consecutive sections blocks with two jobs each
    CLOCK_START
#pragma omp parallel
    {
#pragma omp sections
        {
#pragma omp section
            Job();
#pragma omp section
            Job();
        }
#pragma omp sections
        {
#pragma omp section
            Job();
#pragma omp section
            Job();
        }
    }
    CLOCK_STOP

    //////////////////////////////////////////////////////////////////////////
    // Test 2: one sections block with four jobs
    CLOCK_START
#pragma omp parallel sections
    {
#pragma omp section
        Job();
#pragma omp section
        Job();
#pragma omp section
        Job();
#pragma omp section
        Job();
    }
    CLOCK_STOP

    //////////////////////////////////////////////////////////////////////////
    // Test 3: no OpenMP, four jobs one after another
    CLOCK_START
    Job();
    Job();
    Job();
    Job();
    CLOCK_STOP
    //////////////////////////////////////////////////////////////////////////
}

Time usage comparison:

Test1  time = 628.708 ms (Double Double sections)
Test2  time = 831.781 ms (Four sections)
Test3  time = 2741.43 ms (Do not use OpenMP)

See ? My friend.

Solve Nonlinear function By Binary Divisive Procedure ( c++ code )

Filed under Computer Vision by Wang Kang on 11-07-2010

Here is C++ code that solves the nonlinear function below:

y = x^6 - 5x^5 + 3x^4 + x^3 - 7x^2 + 7x - 20

using the binary divisive procedure (bisection).

#include <math.h>
#include <stdio.h>
#include <stdlib.h>   // system
#include <string.h>   // strcmp
#include <iostream>
using namespace std;

double TestEquation( double x )
{
 double y;
 y = (((((x-5.0)*x+3.0)*x+1.0)*x-7.0)*x+7.0)*x-20.0;
 return y;
}
// Function : BinDivSearchRoot
// Parameters:
//  LowerBound  Lower bound of the search interval
//  UpperBound  Upper bound of the search interval
//      So, this function will find roots in [LowerBound, UpperBound]
//  StepWidth  Progressive step
//  Eps    Error range
//  Root[]   The values of the roots found
//  RootCount  Expected number of roots
//  mail   Author email : novertina@aol.com
int BinDivSearchRoot( double LowerBound, double UpperBound, double StepWidth, double Eps,double Root[], int RootCount, char* mail )
{
 int SearchedRoot, rootStatus;
 double x, y, x1, y1, x0, y0;
 SearchedRoot = 0; x = LowerBound;  y = TestEquation( x );

 while ( ( x <= UpperBound + StepWidth/2.0 )&&( SearchedRoot!=RootCount ) ) // Search root in Sub-interval
 {
  if ( fabs(y) < Eps ) // If this is a ROOT ! ;-)
  {
   SearchedRoot++;
   Root[SearchedRoot-1] = x;
   x += StepWidth/2.0;
   y = TestEquation( x );
  }
  else // Or Not a ROOT
  {
   x1 = x + StepWidth;
   y1 = TestEquation( x1 );
   if ( fabs(y1) < Eps ) // If next Step is a ROOT ! ;-)
   {
    SearchedRoot++;
    Root[SearchedRoot-1] = x1;
    x = x1 + StepWidth/2.0;
    y = TestEquation( x );
   }
   else if ( y * y1 >0.0 ) // NULL ROOT ;-(
   {
    y = y1;
    x = x1;
   }
   else // A ROOT in this Sub-interval !
   {
    rootStatus = 0 ;// Status : 0. Still Not Found
        //   1. Have Found

    while ( rootStatus == 0 )
    {
     if ( fabs( x1-x ) < Eps ) // Sub-Interval length is less than Given Eps, It can be used as a ROOT ! ;-)
     {
      SearchedRoot ++;
      Root[SearchedRoot-1] = (x1+x)/2.0; // Consider the ROOT is Median value of this Sub-Interval
      x = x1 + StepWidth/2.0; // In order to Find Root in next Sub-Interval, this Interval is clear !
      y = TestEquation( x );
      rootStatus = 1; // Don’t forget Set the Status OK !
     }
     else // Sub-Interval length is Bigger than Given Eps, Binary Divide it
     {
      x0 = (x1+x)/2.0;// Binary Divide it
      y0 = TestEquation( x0 );

      if ( fabs(y0) < Eps ) // Here is a ROOT !
      {
       Root[SearchedRoot] = x0;
       SearchedRoot ++;
       x = x0 + StepWidth/2.0;
       y = TestEquation( x );
       rootStatus = 1;
      }
      else if ( (y*y0) < 0.0 )// a ROOT exists in [x,x0]
      {
       x1 = x0;
       y1 = y0;
      }
      else
      {
       x = x0;
       y = y0;
      }
     }
    }
   }
  }
 }
 if ( strcmp(mail,"novertina@aol.com") )
 {
  // Here is a Joke
  // Purpose is to prevent lazy students who just copy the code do not understand the meaning
  // And let them get Null result
  // Sorry ;-)
  return 0;
 }
 return(SearchedRoot);  // Return Root Count
}

void SolveFunctionMain()
{

 int    rootCount;
 static int  m = 6;
 static double x[6];
 double   LowerBound = -2.0;
 double   UpperBound =  5.0;
 double   StepWidth =  0.2;
 double   Eps   =  0.000001;
 char   mail[100] = "novertina@aol.com";

 cout<<"———   This is a program to solve Nonlinear function"<<endl;
 cout<<"———   By Binary Divisive Procedure"<<endl;
 cout<<endl;

 cout<<"———   The Nonlinear function is:"<<endl;
 cout<<"y = (((((x-5.0)*x+3.0)*x+1.0)*x-7.0)*x+7.0)*x-20.0"<<endl;
 cout<<endl;

 rootCount = BinDivSearchRoot( LowerBound, UpperBound, StepWidth, Eps, x, m, mail );

 cout<<"Solve Successfully!"<<endl;
 cout<<endl;

 cout<<"The function has "<<rootCount<<" roots,they are:"<<endl;
 for (int i = 0 ; i < rootCount ; i++ )
 {
  cout<<"x("<<i<<") = "<<x[i]<<endl;
 }

 system("PAUSE");
}
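The halving step at the heart of the code above can also be looked at in isolation. Here is a minimal sketch of it, assuming the interval [lo, hi] already brackets exactly one sign change; the function name bisect and its parameters are made up for this sketch and are not part of the program above.

```cpp
#include <functional>

// Halve [lo, hi] until it is shorter than eps, always keeping the half in
// which f changes sign. Assumes f(lo) and f(hi) have opposite signs.
double bisect(const std::function<double(double)>& f,
              double lo, double hi, double eps) {
    double flo = f(lo);
    while (hi - lo > eps) {
        const double mid = 0.5 * (lo + hi);
        const double fmid = f(mid);
        if (fmid == 0.0) return mid;  // landed exactly on a root
        if (flo * fmid < 0.0) {
            hi = mid;                 // sign change is in [lo, mid]
        } else {
            lo = mid;                 // sign change is in [mid, hi]
            flo = fmid;
        }
    }
    return 0.5 * (lo + hi);           // midpoint of the final interval
}
```

BinDivSearchRoot wraps this idea in a scan: it walks over the search range in steps of StepWidth and only starts halving in a sub-interval where the function changes sign.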

Happy New Year – The Tiger Year of China;-)

Filed under Computer Vision by Wang Kang on 14-02-2010

The Symbol on my head is my Family name;-)

Just like the Symbol on the head of Tiger

Happy New Year:-)

OpenCV memory leaking management in C/C++

Filed under Computer Vision by Wang Kang on 25-12-2009

If you’re new to OpenCV, you need to know exactly how to manage all the huge amounts of memory you’re using. C/C++ isn’t a garbage collected language (like Java), so you need to manually release memory as soon as its use is over. If you don’t, your program could use up hundreds of MBs of highly valuable RAM… and often even crash (out-of-memory errors?)

It can be a daunting task to hunt exactly where memory needs to be released. So I’ve compiled this short list of places where you should look out for memory leaks.

Create it, then Release it

If you create something, make sure you release it before “returning”. This is probably the very first thing you should check when fixing memory leak problems with OpenCV. For example, if you do a cvCreateImage, make sure you do a cvReleaseImage. There are many things you can create. Here are some functions that “create” and their corresponding “release” functions

Create function          Release function
cvCreateImage            cvReleaseImage
cvCreateImageHeader      cvReleaseImageHeader
cvCreateMat              cvReleaseMat
cvCreateMatND            cvReleaseMatND
cvCreateData             cvReleaseData
cvCreateSparseMat        cvReleaseSparseMat
cvCreateMemStorage       cvReleaseMemStorage
cvCreateGraphScanner     cvReleaseGraphScanner
cvOpenFileStorage        cvReleaseFileStorage
cvAlloc                  cvFree

One warning though: if you create something and want to return it, don’t release it. Let’s say you have a function that creates a checkerboard image and returns it. If you release the image before returning it, you’re freeing all the memory that stores the image data. And when the caller then tries to access memory that is no longer yours, you get a crash.

Release returned structures

This is the second thing you should check for. Often, once a function returns a structure (say, an image), you forget about it. The caller now owns that memory and has to release it when it is done with it.
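If you compile as C++, one way to make a returned structure hard to forget is to hand it to a smart pointer whose deleter calls the release function. The sketch below is only an illustration of that idea: Image, createImage and releaseImage are hypothetical stand-ins so that it compiles without OpenCV (with the real C API they would be IplImage, cvCreateImage and cvReleaseImage), and g_liveImages exists only to make the effect visible.

```cpp
#include <memory>

// Hypothetical stand-ins for an OpenCV-style create/release pair.
struct Image {
    int width, height;
};

int g_liveImages = 0;  // number of images created but not yet released

Image* createImage(int w, int h) {
    ++g_liveImages;
    return new Image{w, h};
}

void releaseImage(Image** img) {
    if (img && *img) {
        delete *img;
        *img = nullptr;
        --g_liveImages;
    }
}

// Deleter that forwards to the release function.
struct ImageReleaser {
    void operator()(Image* img) const { releaseImage(&img); }
};

using ImagePtr = std::unique_ptr<Image, ImageReleaser>;

// The function hands ownership to the caller. The caller's ImagePtr releases
// the image automatically when it goes out of scope.
ImagePtr createCheckerBoard(int w, int h) {
    return ImagePtr(createImage(w, h));
}
```

With plain C there is no such helper, so the rule stays the same as before: whoever receives the returned pointer is responsible for calling the matching release function.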

Multiple Memory Allocations

This is the third thing you should check for: Allocating memory, and then changing the pointer itself. Here’s some example code:

IplImage* image = cvCreateImage(whatever);
image = CreateCheckerBoard(whatever);

cvReleaseImage(&image);

This function creates a memory leak. First, you allocate some memory for image . Then, you call the function CreateCheckerBoard. This function itself creates new memory. And image now points to this new memory. The memory created in the first step is lost forever. No variable points to it. A memory leak. To fix this, you need to modify the code like this:

IplImage* image = NULL;
image = CreateCheckerBoard(whatever);

cvReleaseImage(&image);

If you return a sequence, release its storage

There are many instances where you use the CvSeq data structure. And often you might want to return this structure for further use. If you release its storage (a CvMemStorage structure) within the function itself, you’d free the memory where the sequence is stored. And then you’d try and access it in the calling function. Again, crash.

A temporary fix would be to just erase the cvReleaseMemStorage statement… but that would leak lots of memory.
To fix this properly, you don’t release the memory in the function itself. You release it in the calling function like this:

cvReleaseMemStorage(&thesequence->storage);

storage is a member of the CvSeq structure that always points to the memory storage where the sequence is stored.

Again, this is just an example. There are more structures where a similar situation could arise.

Dependence on other structures

I quite recently discovered this memory leak. To explain this, I’ll use an example: let’s say you find the contours of an image. OpenCV would return a “linked list” type structure called CvSeq. You decide to access the third element of this linked list. OpenCV returns a pointer to the third element. All going great till this moment.

Now you decide to save all the points of this contour (the third element) in a data structure of your own. Since this is an array of points, you do something like:

mystructure->points = thirdcontour->points;

You set the pointer equal to the one in thirdcontour. This is the bug. If you release the storage of the sequence (which you should), mystructure has a bad pointer. To fix this, allocate new memory for mystructure->points and then copy the contents of thirdcontour->points, something like this:

mystructure->points = (CvPoint*)malloc(sizeof(CvPoint) * thirdcontour->total);
memcpy(mystructure->points,thirdcontour->points,sizeof(CvPoint)*thirdcontour->total);

This creates new memory for your structure and then copies each element there. Once you’ve done this, you can release the storage of the sequence without fear.

[Original source from: LiquidMetal , all rights reserved by original authors]
