Proposal - The Adventures of Tintin

Summary of the Project

Our Minecraft AI project focuses on biome recognition in Minecraft. Given an image observed by a Minecraft agent, our model predicts which kind of biome appears in the image. More specifically, we will train the model on many images observed by the agent so that it learns the visual patterns of each biome, then use the trained model to classify new (test) images and compare its predictions with the expected labels.

Biome in Minecraft

A biome is a region in the Minecraft world with specific geographical features, such as flora and height. The image below, from Gamepedia, shows a forest biome in Minecraft.

Our first step is binary classification for one specific kind of biome (such as ocean or forest), that is, deciding whether or not an image shows that biome. If time permits, we will extend this to classification among a few visually distinct biomes. Each biome has a dominant color and an associated temperature.
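To illustrate how a dominant color could serve as a crude baseline for the binary case, here is a minimal sketch in plain Python. It is not part of the planned model: the `looks_like_ocean` rule, the `margin` value, and the pixel representation are all hypothetical, and a real frame would also contain sky, which is blue as well.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def mean_color(pixels: List[Pixel]) -> Tuple[float, float, float]:
    """Average the R, G and B channels over all pixels of one frame."""
    n = len(pixels)
    if n == 0:
        raise ValueError("frame has no pixels")
    r = sum(p[0] for p in pixels) / n
    g = sum(p[1] for p in pixels) / n
    b = sum(p[2] for p in pixels) / n
    return (r, g, b)


def looks_like_ocean(pixels: List[Pixel], margin: float = 20.0) -> bool:
    """Hypothetical rule: mean blue exceeds mean red and green by `margin`."""
    r, g, b = mean_color(pixels)
    return b > r + margin and b > g + margin


# Toy frames standing in for agent observations (assumed values).
ocean_frame = [(30, 60, 200)] * 4
forest_frame = [(40, 140, 50)] * 4
print(looks_like_ocean(ocean_frame), looks_like_ocean(forest_frame))
```

A rule like this would mainly be useful as a sanity check against which a learned classifier can be compared.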

Collect data

The data we need to collect are images of biomes. We will generate biomes with XML in Malmo and capture labeled image data for training and testing. We plan to collect at least 500 training samples and to keep the classes balanced at first. If time permits, we will also test on imbalanced data if needed.
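One way to keep the classes balanced and to separate training from testing data is sketched below. It assumes each sample is an `(image_path, biome_label)` pair; the function name `balance_and_split`, the file names, and the `test_fraction` value are illustrative assumptions, not a fixed part of the plan.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (image_path, biome_label), an assumed layout


def balance_and_split(
    samples: List[Sample], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Downsample every class to the smallest class size, then split per class."""
    if not samples:
        raise ValueError("no samples given")
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    by_label: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    per_class = min(len(items) for items in by_label.values())
    n_test = int(round(per_class * test_fraction))

    train: List[Sample] = []
    test: List[Sample] = []
    for label in sorted(by_label):
        items = by_label[label][:]
        rng.shuffle(items)
        items = items[:per_class]      # drop the surplus of larger classes
        test.extend(items[:n_test])
        train.extend(items[n_test:])

    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Hypothetical file names: 10 forest frames and 6 ocean frames.
data = [("forest_%d.png" % i, "forest") for i in range(10)]
data += [("ocean_%d.png" % i, "ocean") for i in range(6)]
train_set, test_set = balance_and_split(data, test_fraction=0.5)
print(len(train_set), len(test_set))
```

Splitting per class rather than over the whole pool keeps both the training and the testing set balanced.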

AI/ML Algorithms

Recognizing a biome is a classification problem. Convolutional Neural Networks (CNNs) are widely used for image recognition and classification. Besides a CNN, we may also test the performance of an SVM classifier, Random Forest, and Gradient Boosting, because these are well-established methods for classification problems. We are also open to other algorithms that come up during our research.

We plan to use the TensorFlow framework to implement a Convolutional Neural Network (CNN) for image recognition. Other possible frameworks are Caffe, CNTK, and Scikit-Learn.
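For readers unfamiliar with CNNs, the sketch below shows the three basic operations a convolutional layer stack applies to an image: convolution, ReLU, and max pooling. It is written in plain Python for illustration only and is not the TensorFlow implementation we plan to build. The kernel and the toy image are assumed values, and a real network learns its kernels from data.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` with stride 1 and no padding.

    Like most CNN libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map: Matrix) -> Matrix:
    """Keep the maximum of each non-overlapping 2x2 window."""
    return [
        [
            max(
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            )
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Toy 5x5 grayscale image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges
pooled = max_pool_2x2(relu(conv2d_valid(image, edge_kernel)))
print(pooled)
```

The strong response in the left column of the pooled output marks where the edge lies, which is the kind of local feature later layers combine into biome-level patterns.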

To accelerate training, we will use Amazon Web Services (AWS) and Docker with GPU support.

Evaluation Plan

Our evaluation plan has two parts: a quantitative evaluation and a qualitative evaluation. For the quantitative evaluation, we plan to test on at least 200 images and measure how many are classified correctly. For the qualitative evaluation, we expect to judge the project result by the accuracy of the biome recognition. We will generate different sets of training and testing data. Then we will calculate the error rate (as reported by TensorFlow), the fraction of misclassified images in binary classification, on both the training and testing data and plot the results. We will also compare the AUC of the output under different parameters and choose the parameters that yield the best performance.
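The two metrics named above can be computed by hand, as the following sketch shows. It assumes binary labels coded as 0 and 1 and a model that outputs one score per image. The helper names `error_rate` and `roc_auc` are our own for illustration and are not TensorFlow functions.

```python
from typing import Sequence


def error_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that differ from the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the
    number of samples, which is fine for a few hundred test images.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Assumed toy outputs: 1 = target biome, 0 = any other biome.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]
print(error_rate(labels, predictions))  # 0.5
print(roc_auc(labels, scores))          # 0.75
```

Unlike the error rate, the AUC does not depend on the 0.5 threshold, which makes it convenient for comparing parameter settings.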

References

http://minecraft.gamepedia.com/Biome

https://www.tensorflow.org/tutorials/layers

http://www.deeplearningbook.org/contents/convnets.html

http://cs231n.github.io/convolutional-networks/

https://www.youtube.com/watch?v=FmpDIaiMIeA