Preprocessing + inference + postprocessing = 30 ms with FP32 on a Tesla P40. The TensorFlow implementation is tensorflow_PSENet.
- Generating `.wts` from TensorFlow.
- Dynamic batch and dynamic shape input.
- Object-Oriented Programming.
- Practice with C++ 11.
- Generate `.wts`

  Download the pretrained model from https://github.com/liuheng92/tensorflow_PSENet and put `model.ckpt.*` into the `model` dir. Add a file `model/checkpoint` with content:

      model_checkpoint_path: "model.ckpt"
      all_model_checkpoint_paths: "model.ckpt"

  Then run `python gen_tf_wts.py`, which will generate a `psenet.wts`.
- Build with cmake and make

      mkdir build
      cd build
      cmake ..
      make
- Build the engine and run detection

      cp ../psenet.wts ./
      cp ../test.jpg ./
      ./psenet -s  # serialize model to plan file
      ./psenet -d  # deserialize plan file and run inference
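
The `model/checkpoint` file from the first step can also be written by a small script instead of by hand. The following is a minimal sketch, not part of this repository: the helper name `write_checkpoint_file` is hypothetical, and it only assumes the directory layout and file content described above.

```python
from pathlib import Path


def write_checkpoint_file(model_dir="model", prefix="model.ckpt"):
    """Write <model_dir>/checkpoint so that it points at the <prefix>.* files.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    model_dir = Path(model_dir)
    # Fail early if the downloaded checkpoint files are not in place.
    if not any(model_dir.glob(prefix + ".*")):
        raise FileNotFoundError(f"no {prefix}.* files found in {model_dir}")
    content = (
        f'model_checkpoint_path: "{prefix}"\n'
        f'all_model_checkpoint_paths: "{prefix}"\n'
    )
    checkpoint = model_dir / "checkpoint"
    checkpoint.write_text(content)
    return checkpoint
```

Calling `write_checkpoint_file()` from the repository root, after copying `model.ckpt.*` into `model`, should leave the directory ready for `python gen_tf_wts.py`.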
- The output of the network is not completely the same as the TensorFlow output, due to the difference between TensorRT's `addResize` and `tf.image.resize`. I will figure it out.
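
One common reason two resize implementations disagree is the convention used to map an output pixel back to an input coordinate. Whether that is the cause of the mismatch noted above has not been established here; the sketch below only illustrates how far three widely used conventions can drift apart. The function and the mode names are illustrative labels, not the API of either library.

```python
def source_coordinate(dst_index, in_size, out_size, mode):
    """Map an output pixel index to a (fractional) input coordinate."""
    if mode == "half_pixel":
        # Pixel centers sit at index + 0.5 in both images.
        return (dst_index + 0.5) * in_size / out_size - 0.5
    if mode == "asymmetric":
        # Plain scaling of the index, no center offset.
        return dst_index * in_size / out_size
    if mode == "align_corners":
        # First and last pixels of input and output coincide.
        if out_size == 1:
            return 0.0
        return dst_index * (in_size - 1) / (out_size - 1)
    raise ValueError(f"unknown mode: {mode}")


# Upsample a row of 4 pixels to 8 and compare the sampling positions.
for mode in ("half_pixel", "asymmetric", "align_corners"):
    coords = [round(source_coordinate(i, 4, 8, mode), 3) for i in range(8)]
    print(mode, coords)
```

Because the interpolated value depends on these fractional positions, two layers configured with different conventions produce slightly different feature maps even with identical weights.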
- Use `ExponentialMovingAverage` weights.
- Faster preprocess and postprocess.
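
For context on the first item: an exponential moving average keeps a smoothed "shadow" copy of each weight, updated as `shadow = decay * shadow + (1 - decay) * value`, and exporting the shadow values instead of the raw weights is what using the `ExponentialMovingAverage` weights would mean. The sketch below shows only the update rule in plain Python; `ema_update` and the decay value are illustrative and are not taken from this repository.

```python
def ema_update(shadow, value, decay=0.999):
    """One exponential-moving-average step for a single scalar weight."""
    return decay * shadow + (1.0 - decay) * value


# Track a weight that stays at 1.0, starting from a shadow value of 0.0.
shadow = 0.0
for step_value in [1.0, 1.0, 1.0]:
    shadow = ema_update(shadow, step_value, decay=0.5)
print(shadow)
```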