diff --git a/README.md b/README.md
index 3d88779..65043c8 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,23 @@
 # Mask RCNN
 Mask RCNN in TensorFlow
-This repo attempts to reproduce this amazing work by Kaiming He.
-[Mask RCNN](https://arxiv.org/abs/1703.06870).
+
+This repo attempts to reproduce this amazing work by Kaiming He et al. :
+[Mask R-CNN](https://arxiv.org/abs/1703.06870)
+
+## Requirements
+
+- [Tensorflow (>= 1.0.0)](https://www.tensorflow.org/install/install_linux)
+- [Numpy](https://github.com/numpy/numpy/blob/master/INSTALL.rst.txt)
+- [COCO dataset](http://mscoco.org/dataset/#download)
+- [Resnet50](http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz)
 
 ## How-to
-1. Download [COCO](http://mscoco.org/dataset/#download) dataset, place it into `./data`, then run `python download_and_convert_data.py` to build tf-records. It takes a while.
-2. Download pretrained resnet50 model, `wget http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz`, unzip it, place it into `./data/pretrained_models/`
-3. Go to `./libs` and run `make`
-4. run `python train/train.py` for training
-5. There are certainly some bugs, please report them back, and let's solve them together.
+1. Go to `./libs/datasets/pycocotools` and run `make`
+2. Download [COCO](http://mscoco.org/dataset/#download) dataset, place it into `./data`, then run `python download_and_convert_data.py` to build tf-records. It takes a while.
+3. Download pretrained resnet50 model, `wget http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz`, unzip it, place it into `./data/pretrained_models/`
+4. Go to `./libs` and run `make`
+5. run `python train/train.py` for training
+6. There are certainly some bugs, please report them back, and let's solve them together.
 
 ## TODO:
 - [x] ROIAlign
@@ -29,4 +38,12 @@ This repo attempts to reproduce this amazing work by Kaiming He.
 - Anything helps this repo, including **discussion**, **testing**, **promotion** and of course **your awesome code**.
 
 ## Acknowledgment
-This repo borrows tons of code from [TFFRCNN](https://github.com/CharlesShang/TFFRCNN), [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn), [faster_rcnn](https://github.com/ShaoqingRen/faster_rcnn), [tf-models](https://github.com/tensorflow/models)
+This repo borrows tons of code from
+- [TFFRCNN](https://github.com/CharlesShang/TFFRCNN)
+- [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn)
+- [faster_rcnn](https://github.com/ShaoqingRen/faster_rcnn)
+- [tf-models](https://github.com/tensorflow/models)
+
+## License
+See [LICENSE](https://github.com/CharlesShang/FastMaskRCNN/blob/master/LICENSE) for details.
+
diff --git a/data/README.md b/data/README.md
index e0871de..6f09a56 100644
--- a/data/README.md
+++ b/data/README.md
@@ -1,12 +1,9 @@
-Place your coco in this dir, like
+Place and unzip your coco in this dir, like
 
 ```buildoutcfg
 ./data
   ./coco
-    ./train2014.zip
-    ./val2014.zip
-    ./instances_train-val2014.zip
-    ./person_keypoints_trainval2014.zip
-    ./captions_train-val2014.zip
-
+    ./annotations
+    ./train2014
+    ./val2014
 ```
diff --git a/libs/boxes/bbox_transform.py b/libs/boxes/bbox_transform.py
index fadd90c..e8c74e9 100644
--- a/libs/boxes/bbox_transform.py
+++ b/libs/boxes/bbox_transform.py
@@ -58,8 +58,12 @@ def bbox_transform_inv(boxes, deltas):
 
     pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
     pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]
-    pred_w = np.exp(dw) * widths[:, np.newaxis]
-    pred_h = np.exp(dh) * heights[:, np.newaxis]
+    # pred_w = np.exp(dw) * widths[:, np.newaxis]
+    # pred_h = np.exp(dh) * heights[:, np.newaxis]
+
+    pred_w = np.exp(dw + np.log(widths[:, np.newaxis]))
+    pred_h = np.exp(dh + np.log(heights[:, np.newaxis]))
+
 
     pred_boxes = np.zeros(deltas.shape, dtype=deltas.dtype)
     # x1
diff --git a/train/train.py b/train/train.py
index 839c178..cb04d8f 100644
--- a/train/train.py
+++ b/train/train.py
@@ -127,8 +127,8 @@ def train():
 
     ## network
     logits, end_points, pyramid_map = network.get_network(FLAGS.network, image,
-            weight_decay=FLAGS.weight_decay)
-    outputs = pyramid_network.build(end_points, ih, iw, pyramid_map,
+            weight_decay=FLAGS.weight_decay, is_training=True)
+    outputs = pyramid_network.build(end_points, im_shape[1], im_shape[2], pyramid_map,
             num_classes=81,
             base_anchors=9,
             is_training=True,