Welcome to Part 5 of our Data Analysis with Python and Pandas tutorial series. In this tutorial, we're going to be covering how to combine dataframes in a variety of ways.
In our case with real estate investing, we're hoping to take the 50 dataframes with housing data and combine them all into one dataframe. We do this for multiple reasons. First, a single dataframe is easier to work with, and it just makes sense to keep related data together. Second, it will use less memory. Every dataframe has a date column and a value column. The date column is repeated across all the dataframes, but really they should all just share the one. With 50 dataframes of two columns each, that takes us from 100 columns down to 51, nearly halving our total column count.
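To make the shared date column concrete, here is a minimal sketch using three made-up dataframes in place of the 50. The state abbreviations, the dates, the numbers, and the `Date` / `Value` column names are all assumptions for illustration, not the actual housing data from this series.

```python
import pandas as pd

# Hypothetical stand-ins for the per-state dataframes: each one carries
# its own copy of the Date column next to its Value column.
dates = pd.to_datetime(["2015-01-31", "2015-02-28", "2015-03-31"])
frames = {
    "TX": pd.DataFrame({"Date": dates, "Value": [100.0, 101.5, 102.3]}),
    "CA": pd.DataFrame({"Date": dates, "Value": [200.0, 198.7, 201.1]}),
    "NY": pd.DataFrame({"Date": dates, "Value": [150.0, 151.2, 149.8]}),
}

# Kept separate, every dataframe stores two columns.
columns_before = sum(len(df.columns) for df in frames.values())

# Move Date into the index and name each Value column after its state,
# so the columns can sit side by side without clashing.
series = [df.set_index("Date")["Value"].rename(state) for state, df in frames.items()]
combined = pd.concat(series, axis=1)

# Combined, we store one value column per state plus the single shared date index.
columns_after = len(combined.columns) + 1

print(combined)
print(columns_before, "columns before,", columns_after, "after")
```

With three dataframes the count only drops from 6 to 4, but the saving grows with every dataframe added, since each new one contributes a single column instead of two.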
When combining dataframes, you might have quite a few goals in mind. For example, you may want to "append" to them, adding to the end, which basically means adding more rows. Or maybe you want to add more columns, like in our case. There are four major ways of combining dataframes, which we'll begin covering now: concatenation, joining, merging, and appending. We'll begin with concatenation.
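As a rough preview of the difference between adding rows and adding columns, here is a small sketch. The dataframes and their column names (`HPI`, `Int_rate`, `Unemployment`) are invented for illustration. One note on appending: the `DataFrame.append` method was removed in pandas 2.0, so this sketch uses `pd.concat` for the row-wise case as well.

```python
import pandas as pd

# Made-up example data, indexed by year.
df1 = pd.DataFrame({"HPI": [80, 85, 88], "Int_rate": [2, 3, 2]},
                   index=[2001, 2002, 2003])
df2 = pd.DataFrame({"HPI": [90, 92], "Int_rate": [1, 2]},
                   index=[2004, 2005])
df3 = pd.DataFrame({"Unemployment": [7, 8, 9]},
                   index=[2001, 2002, 2003])

# Adding rows ("appending"): same columns, new index values stacked at the end.
more_rows = pd.concat([df1, df2])

# Adding columns: same index, new columns placed alongside (axis=1).
more_columns = pd.concat([df1, df3], axis=1)

# Joining and merging can also line frames up on a shared index.
joined = df1.join(df3)
merged = pd.merge(df1, df3, left_index=True, right_index=True)

print(more_rows)
print(more_columns)
```

Here `more_rows` has five rows and the original two columns, while `more_columns`, `joined`, and `merged` each have three rows and three columns.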
Sample code and text-based version of this tutorial: pythonprogramming.net/concatenate-append-data-analysis-python-pandas-tutorial/
In this 16-video tutorial series from PythonProgramming.net, learn how to use the Pandas library in Python to conduct data analysis operations. Pandas is a Python module, and Python is the programming language that we're going to use. The Pandas module is a high-performance, highly efficient, and high-level data analysis library.
At its core, it is very much like operating a headless version of a spreadsheet, like Excel. Most of the datasets you work with will be what are called dataframes. You may be familiar with this term already, since it is used in other languages, but if not, a dataframe is most often just like a spreadsheet. Columns and rows, that's all there is to it! From here, we can use Pandas to perform operations on our datasets at lightning speed.