Easy Dockerfile-based Apache Superset installation with a custom DB driver

The easiest way to get started with Superset and a custom database driver (MySQL), without docker-compose. Installing Superset with a custom DB driver taught me a lot of lessons the hard way. It drove me crazy.

The official website, https://superset.apache.org, describes two ways to install it. The first is installation from scratch, which uses the pip package installer in a Python virtual environment. Don't do this if you want to avoid frustration. Flask and the other packages depend heavily on specific versions of each other, so a virtual environment is really needed, and even then the result is not very predictable. I don't think you will get it working on the first, second, or even fifth attempt.

Superset in docker, docker-compose, and pip


The second approach described on https://superset.apache.org is much better: install Superset using Docker Compose. This approach is fine. Even installing an additional MySQL driver only takes creating a requirements-local.txt file for docker-compose and rebuilding your local image. Then you just need to manage the network of your Superset containers so that MySQL is visible to them, and Superset itself needs to be reachable from the host so you can connect to it. However, if you follow the official Docker Compose installation and the Adding New Database Drivers in Docker instructions, a lack of Docker networking knowledge can get you into trouble.

Typical problems are: the DB driver was not installed, Superset is not listening on its port or not responding, the connection to MySQL failed, the network is unreachable, or ports are closed.

So, for a new installation, I recommend a way that is not described on the official site.

How do I get Superset running on Google VM?

I assume that your remote machine or WSL 2 has Docker installed and the Docker daemon running. You can check this by listing the containers with docker ps.

If you cannot connect to the Docker daemon, you are not ready to go further.
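The daemon check can also be scripted. The following is a minimal sketch, not part of the original setup: it relies on docker info exiting with a non-zero status when the daemon cannot be reached. The DOCKER variable and the docker_ready function are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# DOCKER is a hypothetical override (defaults to the real CLI).
DOCKER="${DOCKER:-docker}"

# Succeeds only when the CLI can talk to the daemon:
# "docker info" exits non-zero if the daemon is unreachable.
docker_ready() {
    "$DOCKER" info >/dev/null 2>&1
}

if docker_ready; then
    echo "Docker daemon is reachable"
else
    echo "Cannot reach the Docker daemon; start it or check your permissions" >&2
fi
```

If the check fails for a non-root user, running it with sudo is usually the first thing to try.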

My approach is described on Docker Hub, where the official Apache Superset image is stored. I wonder why this is not on the official Superset website itself. https://hub.docker.com/r/apache/superset

For now, follow me and not the instructions on Docker Hub. First, we will extend this image as described at the bottom of the Docker Hub page, because we would like to add a MySQL client.

Create Dockerfile from superset base image


Place the following text into a Dockerfile. Please create an empty folder for it first, so that the file ends up at /superset/Dockerfile.
The content of the Dockerfile is simple:
FROM apache/superset
USER root
RUN pip install mysqlclient
RUN pip install sqlalchemy-redshift
USER superset
You can of course install any of the database drivers listed on the official page: https://superset.apache.org/docs/databases/installing-database-drivers
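If you prefer to create the folder and the Dockerfile from the shell in one step, a sketch like the following should do. The BUILD_DIR variable is an assumption of this example; it defaults to ./superset so that no root permissions are needed, whereas the article uses /superset.

```shell
#!/bin/sh
# BUILD_DIR is a hypothetical variable; set BUILD_DIR=/superset to match the article.
BUILD_DIR="${BUILD_DIR:-./superset}"
mkdir -p "$BUILD_DIR"

# The quoted 'EOF' stops the shell from expanding anything inside the heredoc,
# so the file content is written exactly as typed.
cat > "$BUILD_DIR/Dockerfile" <<'EOF'
FROM apache/superset
USER root
RUN pip install mysqlclient
RUN pip install sqlalchemy-redshift
USER superset
EOF

echo "Wrote $BUILD_DIR/Dockerfile"
```

Change into that directory before running the build command.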

Build new image with mysqlclient

I was the root user on the Google Cloud machine, so put sudo before my commands if needed. Now let's build a new Docker image based on the apache/superset base image, as described in our Dockerfile.

docker build -t supers .

Include the dot at the end so that the Dockerfile from the current directory is used. The new image will be named supers.
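Since "the DB driver was not installed" is one of the typical problems, it may be worth checking the new image before starting it. The sketch below tries to import MySQLdb, the Python module that the mysqlclient package provides, inside a throwaway container. It assumes that the image lets you run python directly as the container command; the DOCKER variable and the check_driver function are hypothetical names used only in this example.

```shell
#!/bin/sh
# DOCKER is a hypothetical override; set DOCKER=echo to print the command
# instead of running it.
DOCKER="${DOCKER:-docker}"
IMAGE="supers"

# Start a temporary container (--rm removes it afterwards) and try to import
# the module installed by "pip install mysqlclient".
check_driver() {
    "$DOCKER" run --rm "$IMAGE" python -c "import MySQLdb"
}

if check_driver; then
    echo "mysqlclient is importable in image $IMAGE"
else
    echo "mysqlclient could not be verified in image $IMAGE" >&2
fi
```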


Start new container

It is time to run our container. So easy.

The docker run command will start a container named superset from the image supers, detached and on the same network as the host. So we will not run into problems with advanced networking for now.
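The command itself is not shown above. A sketch that matches the description (container named superset, image supers, detached, host network) could look like this. It is a reconstruction from the text, not necessarily the exact command I used; the DOCKER variable and the run_superset function are hypothetical names.

```shell
#!/bin/sh
# DOCKER is a hypothetical override; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"
IMAGE="supers"        # image built in the previous step
CONTAINER="superset"  # container name used by the later docker exec commands

# --detach runs the container in the background,
# --network host shares the host's network stack with the container.
run_superset() {
    "$DOCKER" run --detach --network host --name "$CONTAINER" "$IMAGE"
}

if run_superset; then
    echo "Started container $CONTAINER from image $IMAGE"
else
    echo "Could not start container $CONTAINER" >&2
fi
```

Recent apache/superset images may also expect a SUPERSET_SECRET_KEY environment variable; if the container exits right after starting, check docker logs superset and pass the key with -e.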


Configure superset

The rest of the configuration commands are on Docker Hub.
$ docker exec -it superset superset fab create-admin \
--username admin \
--firstname Superset \
--lastname Admin \
--email admin@superset.com \
--password admin
$ docker exec -it superset superset db upgrade
$ docker exec -it superset superset load_examples
$ docker exec -it superset superset init
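This is not the end of the journey. Your Superset should now be available on the same network as the host, so in Google Cloud on the same IP as your VM. http://localhost:8080/login
Note that with host networking Docker ignores published ports, and the apache/superset image listens on port 8088 by default. If port 8080 does not respond, try http://localhost:8088/login.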

Install MySQL

Now install MySQL with a docker run command. I hope this command is OK; all my other commands come from my notes, but this one does not. Two caveats: the --publish option has no effect together with --network host, and MySQL 8.4 and later no longer accept --default-authentication-plugin, so pin the image to mysql:8.0 if the container fails to start.
$ docker run --detach --network host --name=minesql --env="MYSQL_ROOT_PASSWORD=XXXX" --publish 3306:3306 mysql --default-authentication-plugin=mysql_native_password
Once the container minesql is up and running, open a bash shell inside it as follows.
$ docker exec -it minesql bash -l

Start the MySQL client from inside the container.
$$ mysql -u root -p
You will get to the MySQL command line.
mysql> CREATE DATABASE xxx;
mysql> USE xxx;
mysql> CREATE TABLE yyy ...... .
Or do whatever you have to do in MySQL.

Connect Superset to DB

Superset is installed and MySQL is installed, both in Docker. Both are on the host network, so they can see each other, whether hosted on a cloud VM or in local WSL.

In the Superset web interface, just choose to connect a database. Enter the IP where Docker is running: localhost, WSL, or the VM in GCP. This is the IP you normally reach over SSH, for example with PuTTY. Use the port of the MySQL server. Enter the name of the database created above (xxx). Enter the username and password (root and XXXX in our case).

That is all. Now fill your DB and create any dashboard you like.
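If you connect with a SQLAlchemy URI instead of the individual form fields, the same values combine into one string. The sketch below only assembles that string. The host 127.0.0.1 and port 3306 are assumptions for a setup where both containers share the host network, and mysql_uri is a hypothetical helper name.

```shell
#!/bin/sh
# Values from the article; host and port are assumptions for a same-host setup.
DB_USER="root"
DB_PASS="XXXX"
DB_HOST="127.0.0.1"
DB_PORT="3306"
DB_NAME="xxx"

# Assemble a mysql:// URI of the form user:password@host:port/database.
mysql_uri() {
    printf 'mysql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

mysql_uri "$DB_USER" "$DB_PASS" "$DB_HOST" "$DB_PORT" "$DB_NAME"
```

Using 127.0.0.1 rather than localhost is deliberate: MySQL client libraries typically treat localhost as a request for a Unix socket, which does not exist inside the Superset container. A password with special characters would need to be URL-encoded first, which this sketch does not do.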

Conclusion

I do not recommend installing Superset with pip at all. I do not recommend installing Superset with docker-compose unless you have at least a little experience with Docker networking. I do recommend installing Superset with a custom DB driver from a Dockerfile. Additionally, use --network host when creating both the Superset container and the MySQL container. This will save you a lot of gray hairs while setting up your development environment.