Download link from this lesson: https://databank.worldbank.org/data/download/WDI_CSV.zip
Showing a Progress Bar
00:00 After you enable streaming, you may want to add extra logic, such as showing download progress. This is where you move from just downloading a file to actively managing the data flow.
00:12 Let’s see how to track your progress as the file downloads.
00:16 When you iterate over chunks, you have the chance to count the bytes as they arrive. This is useful for creating progress bars or logging the status of a long-running download.
00:27
In this code, the only addition to what you saw in the previous lesson is total_bytes, which I initialize to zero before the loop starts. Inside the loop, the if chunk check filters out any empty keep-alive chunks, which keep the connection open without carrying data.
00:46
I write the data, then update my counter. The print() statement uses end="\r", which is a carriage return. This moves the cursor back to the start of the line instead of creating a new line, allowing the text to update in place so you see a running tally of the bytes downloaded.
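The code shown on screen isn’t part of this transcript, so here is a minimal sketch of the byte counter just described. The helper names write_with_counter() and download(), the flush=True argument, and the reading of 100 kilobytes as 100 * 1024 bytes are assumptions made for illustration, not necessarily what the lesson uses.

```python
def write_with_counter(chunks, file):
    """Write each chunk to file and print a running byte tally on one line."""
    total_bytes = 0  # initialized to zero before the loop starts
    for chunk in chunks:
        if chunk:  # skip empty keep-alive chunks
            file.write(chunk)
            total_bytes += len(chunk)
            # "\r" returns the cursor to the start of the line, so the next
            # print() overwrites this one. flush=True is an addition here so
            # the partial line is displayed right away.
            print(f"Downloaded {total_bytes} bytes", end="\r", flush=True)
    print()  # finish the line once the loop is done
    return total_bytes


def download(url, path, chunk_size=100 * 1024):
    """Stream url to path in chunks (chunk_size is an assumed value)."""
    import requests  # imported here so the helper above runs without it

    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        with open(path, "wb") as file:
            chunks = response.iter_content(chunk_size=chunk_size)
            return write_with_counter(chunks, file)
```

Calling download() with the URL from this lesson and a local filename such as "WDI_CSV.zip" should show the tally updating in place in a regular terminal.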
01:03 As you saw in the previous lesson, you can get the full size of the file you’re about to download from the response headers. Comparing that full size to the number of bytes downloaded so far lets you draw a progress bar or log how far along the download is.
01:20 Now let’s add the progress bar in a code example. In the previous lesson, you used this code to download the file. Now we’ll add a few lines of code to show a simple progress bar so that you can visually see the download progress in the terminal while the file is being downloaded.
01:37
First, everything up to the request remains the same. You’re still using requests.get() with stream=True so the file is downloaded in chunks instead of all at once.
01:48
Next, you read the Content-Length value from the response headers. This tells you the total size of the file in bytes. Since header values are strings, you need to convert this value to an integer so you can do math with it.
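One possible way to do this conversion a bit defensively is sketched below. The helper name parse_content_length() is hypothetical, and the None fallback is an assumption added here, since a server isn’t required to send Content-Length (for example, when it uses chunked transfer encoding).

```python
def parse_content_length(headers):
    """Return Content-Length as an int, or None if it is missing or invalid."""
    value = headers.get("Content-Length")
    if value is None:
        return None  # the server did not report a size
    try:
        return int(value)  # header values arrive as strings
    except ValueError:
        return None  # not a plain integer, so treat the size as unknown
```

With requests, you would pass response.headers to this helper. When it returns None, a percentage can’t be computed, and falling back to a plain byte counter is one reasonable option.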
02:01
Then, create a variable called downloaded_size and initialize it to zero. This will keep track of how many bytes you’ve downloaded so far. Next, set bar_width to 50.
02:16
bar_width defines how wide the progress bar will be in the terminal. Inside the loop, after you write each chunk to the file, increase downloaded_size by the size of the chunk you just received.
02:30
Then, calculate the progress by dividing downloaded_size by the total file size. Since the downloaded size never exceeds the total file size, this gives you a value between zero and one, reaching one when the download completes. Using this progress value, you calculate how many characters in the progress bar should be filled.
02:51 In this example, the filled portion is represented with hash symbols and the remaining portion with dashes.
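The calculation can be sketched as a small function. The name render_bar() is hypothetical, and the min() clamp is an extra safeguard assumed here in case more bytes arrive than the header announced.

```python
def render_bar(downloaded_size, total_size, bar_width=50):
    """Build the bar text: '#' for the filled part, '-' for the rest."""
    # Fraction of the file received so far, clamped to at most 1.0
    progress = min(downloaded_size / total_size, 1.0)
    # Number of characters that should appear filled
    filled = int(bar_width * progress)
    return "#" * filled + "-" * (bar_width - filled)
```

For example, render_bar(50, 100, bar_width=10) returns "#####-----", since half of the ten characters are filled.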
03:03
Let’s also convert the progress into a percentage by multiplying it by 100 so it’s easier to read. Finally, the print() statement displays the progress bar and percentage.
03:21
The end parameter with the carriage return allows the output to update on the same line instead of printing a new line each time. This way, you can get real-time feedback during large downloads, which is very useful when working with big files or slow connections.
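Since the on-screen code isn’t included in this transcript, the following is a sketch of how the whole loop might fit together. The function names save_with_progress() and download_with_progress(), the bracketed output format, the flush=True argument, and the 100 * 1024 chunk size are all assumptions. It also assumes the server sends a Content-Length header.

```python
def save_with_progress(chunks, file, total_size, bar_width=50):
    """Write chunks to file while redrawing a one-line progress bar."""
    downloaded_size = 0
    for chunk in chunks:
        if not chunk:  # ignore empty keep-alive chunks
            continue
        file.write(chunk)
        downloaded_size += len(chunk)
        progress = min(downloaded_size / total_size, 1.0)
        filled = int(bar_width * progress)
        bar = "#" * filled + "-" * (bar_width - filled)
        percent = progress * 100
        # end="\r" moves the cursor back so the next print() overwrites this line
        print(f"[{bar}] {percent:.1f}%", end="\r", flush=True)
    print()  # leave the finished bar on screen and start a fresh line
    return downloaded_size


def download_with_progress(url, path, chunk_size=100 * 1024):
    """Stream url to path and show progress (assumes Content-Length is sent)."""
    import requests  # imported here so the helper above runs without it

    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        total_size = int(response.headers["Content-Length"])
        with open(path, "wb") as file:
            chunks = response.iter_content(chunk_size=chunk_size)
            return save_with_progress(chunks, file, total_size)
```

Keeping the drawing logic in a function that accepts any iterable of byte chunks is a design choice made for this sketch, so it can be tried with a short list of bytes objects before pointing it at a real download.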
03:38 Now, if you’re running this code inside Python’s REPL like me, you’ll notice that instead of updating on the same line, the progress bar is printed on a new line every time a chunk is downloaded.
03:50
This happens because even though we’re using end with carriage return, the REPL treats each print() call as a new output line, so the progress bar appears stacked line by line.
04:02 This isn’t a problem with the code itself. It’s just a limitation of how the REPL displays output. So let’s run the exact same code in the VS Code terminal and see how the progress bar behaves in a proper terminal environment.
04:23 Here in the VS Code terminal, the progress bar updates smoothly on the same line, giving you a live view of the download progress as the file is being downloaded.
04:35 You might’ve noticed I used 100 kilobytes as the chunk size in this example. If you’re wondering why I chose that number and how it affects your script, that’s exactly what you’re going to learn in the next lesson.
