Paddle_baseline_KDD2019

Paddle baseline for KDD2019 "Context-Aware Multi-Modal Transportation Recommendation" (https://dianshi.baidu.com/competition/29/question)

This repository contains demo code for the KDD2019 "Context-Aware Multi-Modal Transportation Recommendation" competition. It is written in Python and uses PaddlePaddle to solve the task. The repository is still under development, and contributions are welcome. The current baseline scores 0.68 - 0.69 on online submission; for example, my own submission based on these PaddlePaddle networks scored 0.6898. The baseline is published to encourage the use of PaddlePaddle and to help participants build stronger recommendation models with it.

The example code was run on a single Linux machine. Distributed training options are not provided here; to learn more about them, see the model examples at https://github.com/PaddlePaddle/models. As for training speed, with a batch size of 1000, one epoch over all training instances generated from the raw data takes about 8 minutes with the SGD optimizer (and somewhat longer with the Adam optimizer).

The configuration and training process of all the networks are basic. Many optimizations can be built on them to achieve better results, e.g. a better cost function, more powerful feature engineering, carefully designed model validation, and NN optimization tricks.

The code is rough and comes from my daily use. It will be cleaned up in the coming days.

Install PaddlePaddle

Please visit the official PaddlePaddle site (http://www.paddlepaddle.org/documentation/docs/zh/1.4/beginners_guide/install/index_cn.html).

preprocess feature

python preprocess.py # change for different feature strategy

preprocess.py and preprocess_dense.py preprocess the raw data. Two versions are provided: one for all-sparse features and one for sparse plus dense features. Correspondingly, pre_process_test.py and pre_test_dense.py preprocess the raw test data. The training instances are saved as JSON. It is easy to add new features. In our demo, all features are generated from the provided raw data except the weather feature, which is generated from open weather records.
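To illustrate the idea of saving instances as JSON, here is a minimal sketch. It is not the code in preprocess.py: the function names and all field names (`sid`, `req_time`, `click_mode`, `weather`) are assumptions chosen for illustration, and the real scripts may structure instances differently.

```python
import json


def build_instance(record, weather_by_date):
    """Turn one raw record (a dict) into a JSON-serializable instance.

    All field names here are hypothetical, not taken from the repository.
    """
    date, clock = record["req_time"].split(" ")
    weather = weather_by_date.get(date, {})
    return {
        "sid": record["sid"],
        "sparse": {
            # Hour of day extracted from the request timestamp.
            "hour": int(clock.split(":")[0]),
            # Weather is looked up from an external table keyed by date.
            "weather": weather.get("weather", "unknown"),
        },
        # Test records carry no label, so fall back to 0.
        "label": record.get("click_mode", 0),
    }


def dump_instances(records, weather_by_date, path):
    """Write one JSON object per line."""
    with open(path, "w") as f:
        for record in records:
            f.write(json.dumps(build_instance(record, weather_by_date)) + "\n")
```

Under this layout, adding a feature amounts to adding one more key to the `sparse` dictionary.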

build the network

The main network logic is in network_confv?.py. The networks are based on FM- and deep-network-related algorithms. I tried several networks and published some of them. There may be some defects in the networks, but all of them are functional.
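For readers new to FM (factorization machines), the second-order interaction term can be sketched in plain Python as follows. This is a generic illustration of the standard FM identity, not the PaddlePaddle code in network_confv?.py; the function name and the list-based inputs are assumptions made for this example.

```python
def fm_second_order(x, V):
    """Second-order FM term: sum over i < j of <V[i], V[j]> * x[i] * x[j].

    x: list of n feature values.
    V: list of n latent vectors, each of length k.
    Uses the identity 0.5 * sum_f ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2),
    which costs O(n * k) instead of O(n^2 * k).
    """
    n, k = len(x), len(V[0])
    total = 0.0
    for f in range(k):
        terms = [V[i][f] * x[i] for i in range(n)]
        linear_sum = sum(terms)
        square_sum = sum(t * t for t in terms)
        total += linear_sum * linear_sum - square_sum
    return 0.5 * total
```

In an FM & deep style model, a term like this is typically combined with a first-order linear term and the output of a feed-forward network over the same embeddings.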

train the network

python local_train.py

local_train.py and map_reader.py use the Dataset API, so you need to download the corresponding .whl package or clone the develop branch of PaddlePaddle. The Dataset API is used because it feeds data much faster.

test results

python generate_test.py
python build_submit.py

For convenience, generate_test.py and build_submit.py use the whole training data to train the network and then test it on the provided unlabeled data.
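As a rough sketch of the last step, the snippet below turns per-session scores into a submission file by picking the highest-scoring mode for each session. It is not the code in build_submit.py: the function names, the input layout, and the column names `sid` and `recommend_mode` are assumptions, so check them against the competition's submission rules.

```python
import csv


def pick_modes(predictions):
    """predictions: {sid: [score_mode0, score_mode1, ...]} -> {sid: best mode index}."""
    return {
        sid: max(range(len(scores)), key=scores.__getitem__)
        for sid, scores in predictions.items()
    }


def write_submission(modes, out_file):
    """Write one row per session to an open text file (column names are assumed)."""
    writer = csv.writer(out_file)
    writer.writerow(["sid", "recommend_mode"])
    for sid in sorted(modes):
        writer.writerow([sid, modes[sid]])
```

A call such as `write_submission(pick_modes(predictions), open("submit.csv", "w", newline=""))` would produce the file, where `submit.csv` is a placeholder name.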
