102->201 / NAS->autoDL / more configs of TAS / reorganize docs / fix bugs in NAS baselines
docs/BASELINE.md (new file, 136 lines)
@@ -0,0 +1,136 @@
# Basic Classification Models

## Performance on CIFAR
| Model | FLOPs | Params (M) | Error on CIFAR-10 | Error on CIFAR-100 | Batch-GPU |
|:------------------:|:-----------:|:----------:|:-----------------:|:------------------:|:---------:|
| ResNet-08 | 12.50 M | 0.08 | 12.14 | 40.20 | 256-2 |
| ResNet-20 | 40.81 M | 0.27 | 7.26 | 31.38 | 256-2 |
| ResNet-32 | 69.12 M | 0.47 | 6.19 | 29.56 | 256-2 |
| ResNet-56 | 125.75 M | 0.86 | 5.74 | 26.82 | 256-2 |
| ResNet-110 | 253.15 M | 1.73 | 5.14 | 25.18 | 256-2 |
| ResNet-110 | 253.15 M | 1.73 | 5.06 | 25.49 | 256-1 |
| ResNet-164 | 247.65 M | 1.70 | 4.36 | 21.48 | 256-2 |
| ResNet-1001 | 1491.00 M | 10.33 | 5.34 | 22.50 | 256-2 |
| DenseNet-BC100-12 | 287.93 M | 0.77 | 4.68 | 22.76 | 256-2 |
| DenseNet-BC100-12 | 287.93 M | 0.77 | 4.25 | 21.54 | 128-2 |
| DenseNet-BC100-12 | 287.93 M | 0.77 | 5.51 | 24.67 | 64-1 |
| WRN-28-10 | 5243.33 M | 36.48 | 3.61 | 19.65 | 256-2 |

`Batch-GPU` denotes the training batch size and the number of GPUs, e.g., `256-2` means a batch size of 256 on two GPUs.
```
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/base-train.sh cifar10 ResNet20 E300 L1 256 -1
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/base-train.sh cifar10 ResNet56 E300 L1 256 -1
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/base-train.sh cifar10 ResNet110 E300 L1 256 -1
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/base-train.sh cifar10 ResNet164 E300 L1 256 -1
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/base-train.sh cifar10 DenseBC100-12 E300 L1 256 -1
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/base-train.sh cifar10 WRN28-10 E300 L1 256 -1
CUDA_VISIBLE_DEVICES=0,1 python ./exps/basic-eval.py --data_path ${TORCH_HOME}/ILSVRC2012 --checkpoint
CUDA_VISIBLE_DEVICES=0,1 python ./exps/test-official-CNN.py --data_path ${TORCH_HOME}/ILSVRC2012
```
Train some NAS models:
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar10 SETN 96 -1
CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar100 SETN 96 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/nas-infer-train.sh imagenet-1k SETN 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/nas-infer-train.sh imagenet-1k SETN1 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/nas-infer-train.sh imagenet-1k DARTS 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/nas-infer-train.sh imagenet-1k GDAS_V1 256 -1
```
## Performance on ImageNet

| Model | FLOPs (G) | Params (M) | Top-1 Error | Top-5 Error | Optimizer |
|:-----------------:|:----------:|:----------:|:-----------:|:-----------:|:-----------------:|
| ResNet-18 | 1.814 | 11.69 | 30.24 | 10.92 | Official |
| ResNet-18 | 1.814 | 11.69 | 29.97 | 10.43 | Step-120 |
| ResNet-18 | 1.814 | 11.69 | 29.35 | 10.13 | Cosine-120 |
| ResNet-18 | 1.814 | 11.69 | 29.45 | 10.25 | Cosine-120 B1024 |
| ResNet-18 | 1.814 | 11.69 | 29.44 | 10.12 | Cosine-S-120 |
| ResNet-18 (DS) | 2.053 | 11.71 | 28.53 | 9.69 | Cosine-S-120 |
| ResNet-34 | 3.663 | 21.80 | 25.65 | 8.06 | Cosine-120 |
| ResNet-34 (DS) | 3.903 | 21.82 | 25.05 | 7.67 | Cosine-S-120 |
| ResNet-50 | 4.089 | 25.56 | 23.85 | 7.13 | Official |
| ResNet-50 | 4.089 | 25.56 | 22.54 | 6.45 | Cosine-120 |
| ResNet-50 | 4.089 | 25.56 | 22.71 | 6.38 | Cosine-120 B1024 |
| ResNet-50 | 4.089 | 25.56 | 22.34 | 6.22 | Cosine-S-120 |
| ResNet-50 (DS) | 4.328 | 25.58 | 22.67 | 6.39 | Step-120 |
| ResNet-50 (DS) | 4.328 | 25.58 | 21.94 | 6.23 | Cosine-120 |
| ResNet-50 (DS) | 4.328 | 25.58 | 21.71 | 5.99 | Cosine-S-120 |
| ResNet-101 | 7.801 | 44.55 | 20.93 | 5.57 | Cosine-120 |
| ResNet-101 | 7.801 | 44.55 | 20.92 | 5.58 | Cosine-120 B1024 |
| ResNet-101 (DS) | 8.041 | 44.57 | 20.36 | 5.22 | Cosine-S-120 |
| ResNet-152 | 11.514 | 60.19 | 20.10 | 5.17 | Cosine-120 B1024 |
| ResNet-152 (DS) | 11.753 | 60.21 | 19.83 | 5.02 | Cosine-S-120 |
| ResNet-200 | 15.007 | 64.67 | 20.06 | 4.98 | Cosine-S-120 |
| Next50-32x4d (DS) | 4.2 | 25.0 | 22.2 | - | Official |
| Next50-32x4d (DS) | 4.470 | 25.05 | 21.16 | 5.65 | Cosine-S-120 |
| MobileNet-V2 | 0.300 | 3.40 | 28.0 | - | Official |
| MobileNet-V2 | 0.300 | 3.50 | 27.92 | 9.50 | MobileFast |
| MobileNet-V2 | 0.300 | 3.50 | 27.56 | 9.26 | MobileFast-Smooth |
| ShuffleNet-V2 1.0 | 0.146 | 2.28 | 30.6 | 11.1 | Official |
| ShuffleNet-V2 1.0 | 0.145 | 2.28 | | | Cosine-S-120 |
| ShuffleNet-V2 1.5 | 0.299 | | 27.4 | - | Official |
| ShuffleNet-V2 1.5 | | | | | Cosine-S-120 |
| ShuffleNet-V2 2.0 | | | | | Cosine-S-120 |

`DS` indicates a deep-stem for the first convolutional layer.
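For reference, a deep-stem commonly replaces the single 7x7 stride-2 stem convolution with a stack of three 3x3 convolutions, as in the ResNet-C/D variants. The sketch below is this common formulation, not necessarily this repository's exact stem:
```
import torch.nn as nn

def deep_stem(in_planes=3, out_planes=64):
    # three 3x3 convolutions replacing the usual single 7x7/stride-2 stem conv
    return nn.Sequential(
        nn.Conv2d(in_planes, out_planes // 2, 3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(out_planes // 2), nn.ReLU(inplace=True),
        nn.Conv2d(out_planes // 2, out_planes // 2, 3, stride=1, padding=1, bias=False),
        nn.BatchNorm2d(out_planes // 2), nn.ReLU(inplace=True),
        nn.Conv2d(out_planes // 2, out_planes, 3, stride=1, padding=1, bias=False),
        nn.BatchNorm2d(out_planes), nn.ReLU(inplace=True),
    )
```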
```
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet18V1 Step-Soft 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet18V1 Cos-Soft 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet18V1 Cos-Soft 1024 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet18V1 Cos-Smooth 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet18V2 Cos-Smooth 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet34V2 Cos-Smooth 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet50V1 Cos-Soft 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet50V2 Step-Soft 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet50V2 Cos-Soft 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNet101V2 Cos-Smooth 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh ResNext50-32x4dV2 Cos-Smooth 256 -1
```
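The `Smooth` suffix in the option names above presumably enables label smoothing. A minimal sketch of label-smoothed cross-entropy (the smoothing factor `eps=0.1` is a common default, not necessarily the scripts' exact value):
```
import torch.nn.functional as F

def smooth_cross_entropy(logits, targets, eps=0.1):
    # target distribution: (1 - eps) on the true class, eps spread uniformly over all classes
    log_probs = F.log_softmax(logits, dim=1)
    nll = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - eps) * nll - eps * log_probs.mean(dim=1)).mean()
```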
Training efficient models may require different hyper-parameters.
```
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh MobileNetV2-X MobileFast 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh MobileNetV2-X MobileFastS 256 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/base-imagenet.sh MobileNetV2 Mobile 256 -1   # 70.96 top-1, 90.05 top-5
```
# Train with Knowledge Distillation

ResNet-110 -> ResNet-20 (teacher -> student):
```
bash ./scripts-cluster/local.sh 0,1 "bash ./scripts/KD-train.sh cifar10 ResNet20 ResNet110 0.9 4 -1"
```

ResNet-110 -> ResNet-110:
```
bash ./scripts-cluster/local.sh 0,1 "bash ./scripts/KD-train.sh cifar10 ResNet110 ResNet110 0.9 4 -1"
```

We set alpha=0.9 and temperature=4 following `Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, ICLR 2017`.
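For reference, a minimal sketch of the standard soft-target distillation loss with these hyper-parameters (a generic formulation in PyTorch, not necessarily `KD-train.sh`'s exact implementation):
```
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, alpha=0.9, temperature=4.0):
    # soft-target term: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep its gradient magnitude comparable to the hard term
    soft = F.kl_div(F.log_softmax(student_logits / temperature, dim=1),
                    F.softmax(teacher_logits / temperature, dim=1),
                    reduction='batchmean') * (temperature ** 2)
    # hard-target term: the usual cross-entropy with the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```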
# Linux

The following command redirects the output of the `top` command to `top.txt`:
```
top -b -n 1 > top.txt
```
## Download the ImageNet dataset

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset has 1000 categories and 1.2 million images. The images do not need to be preprocessed or packaged in any database, but the validation images need to be moved into appropriate subfolders.

1. Download the images from http://image-net.org/download-images

2. Extract the training data:
```bash
mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
cd ..
```

3. Extract the validation data and move images to subfolders:
```bash
mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xvf ILSVRC2012_img_val.tar
wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash
```
docs/CVPR-2019-GDAS.md (new file, 76 lines)
@@ -0,0 +1,76 @@
# [Searching for A Robust Neural Architecture in Four GPU Hours](https://arxiv.org/abs/1910.04465)

<img align="right" src="https://d-x-y.github.com/resources/paper-icon/CVPR-2019-GDAS.png" width="300">

Searching for A Robust Neural Architecture in Four GPU Hours is accepted at CVPR 2019.
In this paper, we propose a Gradient-based searching algorithm using Differentiable Architecture Sampling (GDAS).
GDAS is based on DARTS and improves it with Gumbel-softmax sampling.
Concurrently with our submission, several NAS papers (SNAS and FBNet) also utilized Gumbel-softmax sampling; our method differs in how the forward and backward passes are performed (see our paper and codes for details).
Experiments on CIFAR-10, CIFAR-100, ImageNet, PTB, and WT2 are reported.
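For background, Gumbel-softmax produces a differentiable, approximately one-hot sample over candidate operations. A generic sketch (not this repository's exact code):
```
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=1.0):
    # add Gumbel(0, 1) noise, then apply a temperature-scaled softmax;
    # as tau -> 0 the output approaches a discrete one-hot sample
    gumbels = -torch.empty_like(logits).exponential_().log()
    return F.softmax((logits + gumbels) / tau, dim=-1)

op_weights = gumbel_softmax_sample(torch.randn(5), tau=0.5)  # 5 candidate ops on an edge
```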
## Requirements and Preparation

Please install `Python>=3.6` and `PyTorch>=1.2.0`.

CIFAR and ImageNet should be downloaded and extracted into `$TORCH_HOME`.
### Useful tools

1. Compute the number of parameters and FLOPs of a model (see the runnable sketch after this list):
```
from utils import get_model_infos
flop, param = get_model_infos(net, (1,3,32,32))
```

2. Different NAS-searched architectures are defined [here](https://github.com/D-X-Y/AutoDL-Projects/blob/master/lib/nas_infer_model/DXYs/genotypes.py).
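A runnable variant of the first tool; torchvision's ResNet-18 is only a stand-in for `net`, and the returned values are assumed to be in millions, matching the tables in `docs/BASELINE.md`:
```
import torchvision.models as models
from utils import get_model_infos  # from this repository's `lib`

net = models.resnet18(num_classes=10)               # any torch.nn.Module works
flop, param = get_model_infos(net, (1, 3, 32, 32))  # CIFAR-sized dummy input
print('FLOPs: {:.2f} M, Params: {:.2f} M'.format(flop, param))
```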
## Usage

### Reproducing the results of our searched architecture in GDAS

Please use the following scripts to train the GDAS-searched CNNs on CIFAR-10, CIFAR-100, and ImageNet:
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar10 GDAS_V1 96 -1
CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar100 GDAS_V1 96 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/nas-infer-train.sh imagenet-1k GDAS_V1 256 -1
```
If you are interested in the configs of each NAS-searched architecture, they are defined in [genotypes.py](https://github.com/D-X-Y/AutoDL-Projects/blob/master/lib/nas_infer_model/DXYs/genotypes.py).
### Searching on the NASNet search space

Please use the following script to run GDAS search on the NASNet search space, as in the original paper:
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/GDAS-search-NASNet-space.sh cifar10 1 -1
```
If you want to train the searched architecture found by the above script, you need to add the config of that architecture (it will be printed in the log) to [genotypes.py](https://github.com/D-X-Y/AutoDL-Projects/blob/master/lib/nas_infer_model/DXYs/genotypes.py).
### Searching on a small search space (NAS-Bench-201)

The GDAS searching codes on a small search space:
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/GDAS.sh cifar10 -1
```

The baseline searching codes of DARTS:
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/DARTS-V1.sh cifar10 -1
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/DARTS-V2.sh cifar10 -1
```

**After searching**, if you want to train the searched architecture found by the above scripts, please use the following codes:
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/NAS-Bench-201/train-a-net.sh '|nor_conv_3x3~0|+|nor_conv_3x3~0|nor_conv_3x3~1|+|skip_connect~0|skip_connect~1|skip_connect~2|' 16 5
```
`|nor_conv_3x3~0|+|nor_conv_3x3~0|nor_conv_3x3~1|+|skip_connect~0|skip_connect~1|skip_connect~2|` represents the structure of a searched architecture; the codes automatically print it during the searching procedure.
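To make the encoding concrete, here is a small sketch (a hypothetical helper, not part of the released code) that parses such a string into (operation, input-node) pairs:
```
def parse_arch_str(arch_str):
    # nodes are separated by '+'; each node lists its incoming edges as |op~input_index|
    nodes = []
    for node_str in arch_str.split('+'):
        edges = [e for e in node_str.split('|') if e]
        nodes.append([(op, int(idx)) for op, idx in (e.split('~') for e in edges)])
    return nodes

arch = '|nor_conv_3x3~0|+|nor_conv_3x3~0|nor_conv_3x3~1|+|skip_connect~0|skip_connect~1|skip_connect~2|'
for i, node in enumerate(parse_arch_str(arch), start=1):
    print('node-{:} receives: {:}'.format(i, node))
```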
# Citation

If you find that this project helps your research, please consider citing the following paper:
```
@inproceedings{dong2019search,
  title     = {Searching for A Robust Neural Architecture in Four GPU Hours},
  author    = {Dong, Xuanyi and Yang, Yi},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages     = {1761--1770},
  year      = {2019}
}
```
docs/ICCV-2019-SETN.py (new file, 50 lines)
@@ -0,0 +1,50 @@
# [One-Shot Neural Architecture Search via Self-Evaluated Template Network](https://arxiv.org/abs/1910.05733)

<img align="right" src="https://d-x-y.github.com/resources/paper-icon/ICCV-2019-SETN.png" width="450">

<strong>Highlight</strong>: we equip one-shot NAS with an architecture sampler and train the network weights via uniform sampling.

One-Shot Neural Architecture Search via Self-Evaluated Template Network is accepted by ICCV 2019.
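Conceptually, uniformly-sampled weight training looks like the sketch below; `supernet.random_genotype()` and the forward signature are hypothetical stand-ins, not the released API:
```
def train_shared_weights(supernet, loader, optimizer, criterion):
    # one epoch of one-shot weight training: for each batch, sample a candidate
    # architecture uniformly at random and update the shared weights on its path
    supernet.train()
    for inputs, targets in loader:
        arch = supernet.random_genotype()    # hypothetical: uniform sampling
        optimizer.zero_grad()
        loss = criterion(supernet(inputs, arch), targets)
        loss.backward()
        optimizer.step()
```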
## Requirements and Preparation

Please install `Python>=3.6` and `PyTorch>=1.2.0`.

### Useful tools

1. Compute the number of parameters and FLOPs of a model:
```
from utils import get_model_infos
flop, param = get_model_infos(net, (1,3,32,32))
```

2. Different NAS-searched architectures are defined [here](https://github.com/D-X-Y/AutoDL-Projects/blob/master/lib/nas_infer_model/DXYs/genotypes.py).
## Usage

Please use the following scripts to train the SETN-searched CNNs on CIFAR-10, CIFAR-100, and ImageNet:
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar10 SETN 96 -1
CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar100 SETN 96 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/nas-infer-train.sh imagenet-1k SETN 256 -1
```

The searching codes of SETN on a small search space (NAS-Bench-201):
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/SETN.sh cifar10 -1
```
# Citation

If you find that this project helps your research, please consider citing the following paper:
```
@inproceedings{dong2019one,
  title     = {One-Shot Neural Architecture Search via Self-Evaluated Template Network},
  author    = {Dong, Xuanyi and Yang, Yi},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  pages     = {3681--3690},
  year      = {2019}
}
```
docs/NAS-Bench-201.md (new file, 203 lines)
@@ -0,0 +1,203 @@
# [NAS-BENCH-201: Extending the Scope of Reproducible Neural Architecture Search](https://openreview.net/forum?id=HJxyZkBKDr)

We propose an algorithm-agnostic NAS benchmark (NAS-Bench-201) with a fixed search space, which provides a unified benchmark for almost any up-to-date NAS algorithm.
The design of our search space is inspired by the one used in the most popular cell-based searching algorithms, where a cell is represented as a directed acyclic graph.
Each edge is associated with an operation selected from a predefined operation set.
To make it applicable to all NAS algorithms, the search space defined in NAS-Bench-201 includes 4 nodes and 5 associated operation options: a 4-node cell has 6 edges, and choosing one of the 5 operations per edge generates 5^6 = 15,625 neural cell candidates in total.

In this Markdown file, we provide:
- [How to Use NAS-Bench-201](#how-to-use-nas-bench-201)
- [Instruction to re-generate NAS-Bench-201](#instruction-to-re-generate-nas-bench-201)
- [10 NAS algorithms evaluated in our paper](#to-reproduce-10-baseline-nas-algorithms-in-nas-bench-201)

Note: please use `PyTorch >= 1.2.0` and `Python >= 3.6.0`.

Simply type `pip install nas-bench-201` to install our API.

If you have any questions or issues, please post them [here](https://github.com/D-X-Y/AutoDL-Projects/issues) or email me.
### Preparation and Download

The benchmark file of NAS-Bench-201 can be downloaded from [Google Drive](https://drive.google.com/open?id=1SKW0Cu0u8-gb18zDpaAGi0f74UdXeGKs) or [Baidu-Wangpan (code:6u5d)](https://pan.baidu.com/s/1CiaNH6C12zuZf7q-Ilm09w).
You can move it to anywhere you want and pass its path to our API for initialization.
- v1.0: `NAS-Bench-201-v1_0-e61699.pth`, where `e61699` denotes the last six digits for this file. It contains all information except for the trained weights of each trial.
- v1.0: The full data of each architecture can be downloaded from [Google Drive](https://drive.google.com/open?id=1X2i-JXaElsnVLuGgM4tP-yNwtsspXgdQ) (about 226GB). This compressed folder has 15,625 files containing the trained weights.
- v1.0: Checkpoints for 3 runs of each baseline NAS algorithm are provided in [Google Drive](https://drive.google.com/open?id=1eAgLZQAViP3r6dA0_ZOOGG9zPLXhGwXi).

The training and evaluation data used in NAS-Bench-201 can be downloaded from [Google Drive](https://drive.google.com/open?id=1L0Lzq8rWpZLPfiQGd6QR8q5xLV88emU7) or [Baidu-Wangpan (code:4fg7)](https://pan.baidu.com/s/1XAzavPKq3zcat1yBA1L2tQ).
It is recommended to put these data into `$TORCH_HOME` (`~/.torch/` by default). If you want to generate NAS-Bench-201 or similar NAS datasets, or train the models by yourself, you need these data.
## How to Use NAS-Bench-201

1. Creating an API instance from a file:
```
from nas_201_api import NASBench201API as API
api = API('$path_to_meta_nas_bench_file')
api = API('NAS-Bench-201-v1_0-e61699.pth')
api = API('{:}/{:}'.format(os.environ['TORCH_HOME'], 'NAS-Bench-201-v1_0-e61699.pth'))
```

2. Show the number of architectures `len(api)` and each architecture `api[i]`:
```
num = len(api)
for i, arch_str in enumerate(api):
  print ('{:5d}/{:5d} : {:}'.format(i, len(api), arch_str))
```

3. Show the results of all trials for a single architecture:
```
# show all information for a specific architecture
api.show(1)
api.show(2)

# show the mean loss and accuracy of an architecture
info = api.query_meta_info_by_index(1)  # This is an instance of `ArchResults`
res_metrics = info.get_metrics('cifar10', 'train')  # This is a dict with metric names as keys
cost_metrics = info.get_comput_costs('cifar100')  # This is a dict with metric names as keys, e.g., flops, params, latency

# get the detailed information
results = api.query_by_index(1, 'cifar100')  # a dict of all trials for the 1st net on cifar100, where the key is the seed
print ('There are {:} trials for this architecture [{:}] on cifar100'.format(len(results), api[1]))
print ('Latency : {:}'.format(results[0].get_latency()))
print ('Train Info : {:}'.format(results[0].get_train()))
print ('Valid Info : {:}'.format(results[0].get_eval('x-valid')))
print ('Test Info : {:}'.format(results[0].get_eval('x-test')))
# for the metric after a specific epoch
print ('Train Info [10-th epoch] : {:}'.format(results[0].get_train(10)))
```

4. Query the index of an architecture by its string:
```
index = api.query_index_by_arch('|nor_conv_3x3~0|+|nor_conv_3x3~0|avg_pool_3x3~1|+|skip_connect~0|nor_conv_3x3~1|skip_connect~2|')
api.show(index)
```

5. For other usages, please see `lib/nas_201_api/api.py`.
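Building on the calls above, one can, for example, scan the whole benchmark for the architecture with the highest mean training accuracy on CIFAR-10 (a sketch; the `'accuracy'` key of the returned metrics dict is an assumption):
```
best_acc, best_index = -1.0, -1
for index in range(len(api)):
    info = api.query_meta_info_by_index(index)
    metrics = info.get_metrics('cifar10', 'train')   # dict with metric names as keys
    if metrics['accuracy'] > best_acc:
        best_acc, best_index = metrics['accuracy'], index
print('best architecture: {:} ({:.2f}%)'.format(api[best_index], best_acc))
```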
### Detailed Instruction

In `nas_201_api`, we define three classes: `NASBench201API`, `ArchResults`, `ResultsCount`.

`ResultsCount` maintains all information of a specific trial. One can instantiate `ResultsCount` and get the info via the following codes (`000157-FULL.pth` saves all information of all trials of the 157-th architecture):
```
from nas_201_api import ResultsCount
xdata = torch.load('000157-FULL.pth')
odata = xdata['full']['all_results'][('cifar10-valid', 777)]
result = ResultsCount.create_from_state_dict( odata )
print(result)  # print it
print(result.get_train())  # print the final training loss/accuracy/[optional: time-cost of a training epoch]
print(result.get_train(11))  # print the training info of the 11-th epoch
print(result.get_eval('x-valid'))  # print the final evaluation info on the validation set
print(result.get_eval('x-valid', 11))  # print the info on the validation set after the 11-th epoch
print(result.get_latency())  # print the evaluation latency [in batch]
result.get_net_param()  # the trained parameters of this trial
arch_config = result.get_config(CellStructure.str2structure)  # create the network config with params
net_config = dict2config(arch_config, None)
network = get_cell_based_tiny_net(net_config)
network.load_state_dict(result.get_net_param())
```

`ArchResults` maintains all information of all trials of an architecture. Please see the following usages:
```
from nas_201_api import ArchResults
xdata = torch.load('000157-FULL.pth')
archRes = ArchResults.create_from_state_dict(xdata['less'])  # load trials trained with 12 epochs
archRes = ArchResults.create_from_state_dict(xdata['full'])  # load trials trained with 200 epochs

print(archRes.arch_idx_str())  # print the index of this architecture
print(archRes.get_dataset_names())  # print the supported training data
print(archRes.get_comput_costs('cifar10-valid'))  # print all computational info when training on cifar10-valid
print(archRes.get_metrics('cifar10-valid', 'x-valid', None, False))  # print the average loss/accuracy/time over all trials
print(archRes.get_metrics('cifar10-valid', 'x-valid', None, True))  # print the loss/accuracy/time of a randomly selected trial
```

`NASBench201API` is the top-level API. Please see the following usages:
```
from nas_201_api import NASBench201API as API
api = API('NAS-Bench-201-v1_0-e61699.pth')  # This will load all the information of NAS-Bench-201 except the trained weights
api = API('{:}/{:}'.format(os.environ['TORCH_HOME'], 'NAS-Bench-201-v1_0-e61699.pth'))  # The same as the above line, assuming NAS-Bench-201-v1_0-e61699.pth is saved in ~/.torch/
api.show(-1)  # show info of all architectures
api.reload('{:}/{:}'.format(os.environ['TORCH_HOME'], 'NAS-BENCH-201-4-v1.0-archive'), 3)  # This will reload the information of the 3rd architecture with the trained weights

weights = api.get_net_param(3, 'cifar10', None)  # Obtain the weights of all trials for the 3rd architecture on cifar10. It returns a dict, where the key is the seed and the value is the trained weights.
```
## Instruction to Re-Generate NAS-Bench-201

There are four steps to build NAS-Bench-201.

1. Generate the meta file for NAS-Bench-201 using the following script, where `NAS-BENCH-201` indicates the name and `4` indicates the maximum number of nodes in a cell.
```
bash scripts-search/NAS-Bench-201/meta-gen.sh NAS-BENCH-201 4
```

2. Train each architecture on a single GPU (see commands in `output/NAS-BENCH-201-4/BENCH-201-N4.opt-full.script`, which is automatically generated by step 1).
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/NAS-Bench-201/train-models.sh 0 0 389 -1 '777 888 999'
```
This command trains 390 architectures (ids from 0 to 389) using the following four kinds of splits with three random seeds (777, 888, 999).

| Dataset | Train | Eval |
|:---------------:|:-------------:|:------------:|
| CIFAR-10 | train | valid / test |
| CIFAR-10 | train + valid | test |
| CIFAR-100 | train | valid / test |
| ImageNet-16-120 | train | valid / test |

Note that the above `train`, `valid`, and `test` indicate the proposed splits in our NAS-Bench-201, and they might be different from the original splits.

3. Calculate the latency, merge the results of all architectures, and simplify the results
(see commands in `output/NAS-BENCH-201-4/meta-node-4.cal-script.txt`, which is automatically generated by step 1).
```
OMP_NUM_THREADS=6 CUDA_VISIBLE_DEVICES=0 python exps/NAS-Bench-201/statistics.py --mode cal --target_dir 000000-000389-C16-N5
```

4. Merge all results into a single file for the NAS-Bench-201 API.
```
OMP_NUM_THREADS=4 python exps/NAS-Bench-201/statistics.py --mode merge
```
This command generates a single file `output/NAS-BENCH-201-4/simplifies/C16-N5-final-infos.pth` that contains all the data for NAS-Bench-201.
This generated file serves as the input for our NAS-Bench-201 API.

[Optional] Train a single architecture on a single GPU:
```
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/NAS-Bench-201/train-a-net.sh resnet 16 5
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/NAS-Bench-201/train-a-net.sh '|nor_conv_3x3~0|+|nor_conv_3x3~0|nor_conv_3x3~1|+|skip_connect~0|skip_connect~1|skip_connect~2|' 16 5
```
## To Reproduce 10 Baseline NAS Algorithms in NAS-Bench-201

We have tried our best to implement each method. Still, some algorithms might obtain non-optimal results since their hyper-parameters might not fit our NAS-Bench-201.
If researchers can provide better results with different hyper-parameters, we are happy to update the results according to the new experiments. We also welcome more NAS algorithms to be tested on our dataset and would include them accordingly.

**Note that** you need to prepare the training and test data as described in [Preparation and Download](#preparation-and-download).

- [1] `CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/DARTS-V1.sh cifar10 1 -1`, where `cifar10` can be replaced with `cifar100` or `ImageNet16-120`.
- [2] `CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/DARTS-V2.sh cifar10 1 -1`
- [3] `CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/GDAS.sh cifar10 1 -1`
- [4] `CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/SETN.sh cifar10 1 -1`
- [5] `CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/ENAS.sh cifar10 1 -1`
- [6] `CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/RANDOM-NAS.sh cifar10 1 -1`
- [7] `bash ./scripts-search/algos/R-EA.sh -1`
- [8] `bash ./scripts-search/algos/Random.sh -1`
- [9] `bash ./scripts-search/algos/REINFORCE.sh 0.5 -1`
- [10] `bash ./scripts-search/algos/BOHB.sh -1`

In commands [1-6], the first argument `cifar10` indicates the dataset name, the second argument `1` indicates the behavior of BN, and the last argument `-1` indicates the random seed.
# Citation

If you find that NAS-Bench-201 helps your research, please consider citing it:
```
@inproceedings{dong2020nasbench201,
  title     = {NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search},
  author    = {Dong, Xuanyi and Yang, Yi},
  booktitle = {International Conference on Learning Representations (ICLR)},
  url       = {https://openreview.net/forum?id=HJxyZkBKDr},
  year      = {2020}
}
```
docs/NIPS-2019-TAS.md (new file, 71 lines)
@@ -0,0 +1,71 @@
# [Network Pruning via Transformable Architecture Search](https://arxiv.org/abs/1905.09717)

[](https://paperswithcode.com/sota/network-pruning-on-cifar-100?p=network-pruning-via-transformable)

Network Pruning via Transformable Architecture Search is accepted by NeurIPS 2019.
In this paper, we propose a differentiable searching strategy for transformable architectures, i.e., searching for the depth and width of a deep neural network.
You can see the highlights of our Transformable Architecture Search (TAS) on our [project page](https://xuanyidong.com/assets/projects/NeurIPS-2019-TAS.html).

<p float="left">
<img src="https://d-x-y.github.com/resources/paper-icon/NIPS-2019-TAS.png" width="680px"/>
<img src="https://d-x-y.github.com/resources/videos/NeurIPS-2019-TAS/TAS-arch.gif?raw=true" width="180px"/>
</p>
## Requirements and Preparation

Please install `Python>=3.6` and `PyTorch>=1.2.0`.

CIFAR and ImageNet should be downloaded and extracted into `$TORCH_HOME`.
The proposed method utilizes knowledge distillation (KD), which requires pre-trained models. Please download these models from [Google Drive](https://drive.google.com/open?id=1ANmiYEGX-IQZTfH8w0aSpj-Wypg-0DR-) (or train them by yourself) and save them into `.latent-data`.

**LOGS**:
We provide some logs at [Google Drive](https://drive.google.com/open?id=1_qUY4DTtuW_l6ZonynQAC9ttqy35fxZ-). They include (1) logs of training the searched shapes of ResNet-18 and ResNet-50 on ImageNet, (2) logs of searching and training for ResNet-164 on CIFAR, (3) logs of searching and training for ResNet-56 on CIFAR-10, and (4) logs of searching and training for ResNet-110 on CIFAR-100.
## Usage

Use `bash ./scripts/prepare.sh` to prepare the data splits for `CIFAR-10`, `CIFAR-100`, and `ILSVRC2012`.
If you do not have the `ILSVRC2012` data, please comment out L12 in `./scripts/prepare.sh`.

Script arguments: `cifar10` indicates the dataset name, `ResNet56` indicates the base model name, `CIFARX` indicates the searching hyper-parameters, `0.47`/`0.57` indicates the expected FLOP ratio, and `-1` indicates the random seed.
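To make the FLOP-ratio argument concrete, a quick back-of-the-envelope check (the baseline FLOPs are taken from `docs/BASELINE.md`; the snippet is purely illustrative):
```
# ResNet-56 on CIFAR has about 125.75 M FLOPs (see docs/BASELINE.md), so an
# expected FLOP ratio of 0.47 constrains the search to roughly 59.10 M FLOPs.
base_flops_m = 125.75
ratio = 0.47
print('target budget: {:.2f} M FLOPs'.format(ratio * base_flops_m))
```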
**Model Configuration**

The searched shapes for ResNet-20/32/56/110/164 and ResNet-18/50 in Tables 3 and 4 of the original paper are listed in [`configs/NeurIPS-2019`](https://github.com/D-X-Y/AutoDL-Projects/tree/master/configs/NeurIPS-2019).

**Search for the depth configuration of ResNet**
```
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts-search/search-depth-gumbel.sh cifar10 ResNet110 CIFARX 0.57 -1
```

**Search for the width configuration of ResNet**
```
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts-search/search-width-gumbel.sh cifar10 ResNet110 CIFARX 0.57 -1
```

**Search for both depth and width configurations of ResNet**
```
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts-search/search-shape-cifar.sh cifar10 ResNet56 CIFARX 0.47 -1
```

**Training the searched shape config from TAS**
If you want to directly train a model with a searched configuration from TAS, try these:
```
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/tas-infer-train.sh cifar10 C010-ResNet32 -1
CUDA_VISIBLE_DEVICES=0,1 bash ./scripts/tas-infer-train.sh cifar100 C100-ResNet32 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/tas-infer-train.sh imagenet-1k ImageNet-ResNet18V1 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/tas-infer-train.sh imagenet-1k ImageNet-ResNet50V1 -1
```
# Citation

If you find that this project helps your research, please consider citing the following paper:
```
@inproceedings{dong2019tas,
  title     = {Network Pruning via Transformable Architecture Search},
  author    = {Dong, Xuanyi and Yang, Yi},
  booktitle = {Neural Information Processing Systems (NeurIPS)},
  year      = {2019}
}
```