Introduction to Bivariate Data
Bivariate analysis is one of the simplest forms of quantitative (statistical) analysis. It is the analysis of the relationship between two variables, and it is a simple (two-variable) special case of multivariate analysis. The relationship is typically displayed in a bivariate scatterplot. The most common means of summarizing it numerically is the correlation coefficient (sometimes called Pearson's correlation coefficient):
$$ r = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2 \, \sum_i (Y_i - \bar{Y})^2}} $$
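As a rough illustration of how Pearson's correlation coefficient can be computed from paired data, here is a minimal sketch using the standard deviation-product definition. The function name `pearson_r` and the `hours` and `score` values are made up for this example and do not come from the text.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired samples xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of the paired deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations (illustrative only).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 3))
```

A value of r near +1 or -1 suggests the points sit close to a straight line, while a value near 0 suggests little linear relationship.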
But this is weak. A lot of the data is off, well off of the line. But I'd say this is still linear. It seems that, as we increase one, the other one increases at roughly the same rate, although these data points are all over the place.
So, I would still call this linear. Now, there's also this notion of outliers. If I said, hey, this line is trying to describe the data, well, we have some data that is fairly off the line. So, for example, even though we're saying it's a positive, weak, linear relationship, this one over here is reasonably high on the vertical variable, but it's low on the horizontal variable. And so, this one right over here is an outlier. It's quite far away from the line.
You could view that as an outlier. And this is a little bit subjective. Outliers, well, what looks pretty far from the rest of the data? This could also be an outlier. Let me label these. Now, pause the video and see if you can think about this one.
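Since calling a point an outlier is a little bit subjective, one way to make the judgment repeatable is to measure how far each point sits from a fitted line and flag the ones beyond some cutoff. The sketch below assumes a least-squares line and a cutoff of two times the typical residual size. Both choices, the function names, and the data are assumptions made for illustration, not something the video prescribes.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def flag_outliers(xs, ys, k=2.0):
    """Return points whose vertical distance from the line exceeds k times
    the root-mean-square residual. The cutoff k is a subjective choice."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = math.sqrt(sum(e * e for e in residuals) / len(residuals))
    return [(x, y) for x, y, e in zip(xs, ys, residuals) if abs(e) > k * spread]

# Hypothetical data: roughly y = x, with one point high on the vertical
# variable but low on the horizontal variable.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 2.0, 9.0, 4.1, 4.9, 6.2, 6.8, 8.1]
outliers = flag_outliers(xs, ys)
print(outliers)
```

Changing `k` changes which points get flagged, which is the subjectivity showing up as a parameter. Note also that the outlier itself pulls the fitted line toward it, so this is only a first pass.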
Is this positive or negative, is it linear, non-linear, is it strong or weak? I'll get my ruler tool out here. So, this goes here.
It seems like I can fit a line pretty well to this. So, I could fit, maybe I'll do the line in purple. I could fit a line that looks like that. And so, this one looks like it's positive. As one variable increases, the other one does, for these data points.
So it's a positive. I'd say this was pretty strong.
The dots are pretty close to the line there. It really does look like a little bit of a fat line, if you just look at the dots. So, positive, strong, linear relationship.
And none of these data points are really strong outliers. This one's a little bit further out. But they're all pretty close to the line, and seem to describe that trend roughly. All right, now, let's look at this data right over here.
So, let me get my line tool out again.
So, it looks like I can fit a line. And it looks like it's a positive relationship. The line would be upward sloping.
It would look something like this. And, once again, I'm eyeballing it. You can use computers and other methods to actually find a more precise line that minimizes the collective distance to all of the points, but it looks like there is a positive, but I would say, this one is a weak linear relationship, 'cause we have a lot of points that are far off the line.
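One common way a computer finds a more precise line is ordinary least squares, which minimizes the sum of squared vertical distances from the points to the line. That is one specific reading of "collective distance", assumed here for illustration. The sketch below compares such a fitted line with an eyeballed one. The data and the eyeballed line y = x + 1 are made up.

```python
def least_squares_line(xs, ys):
    """Slope and intercept minimizing the sum of squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_error(xs, ys, slope, intercept):
    """Total squared vertical distance between the points and a given line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data points (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
best_sse = sum_squared_error(xs, ys, slope, intercept)

# A hand-drawn guess, y = 1.0 * x + 1.0, for comparison.
eyeballed_sse = sum_squared_error(xs, ys, 1.0, 1.0)

print(slope, intercept, best_sse, eyeballed_sse)
```

The fitted line can never have a larger squared error than the eyeballed one on the same data, which is the sense in which it is "more precise".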
So, not so strong. So, I would call this a positive, weak, linear relationship. And there's a lot of outliers here. This one over here is pretty far, pretty far out. Pause this video and think about, is it positive or negative, is it strong or weak? Is this linear or non-linear? Well, the first thing we wanna do is think about whether it's linear or non-linear.
I could try to put a line on it. But if I try to put a line on it, it's actually quite difficult. If I try to do a line like this, you'll notice everything is kind of bending away from the line.
It looks like, generally, as one variable increases, the other variable decreases, but they're not doing it in a linear fashion. It looks like there's some other type of curve at play. So, I could try to do a fancier curve that looks something like this, and this seems to fit the data a lot better. So this one, I would describe as non-linear. And it is a negative relationship.
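The observation that everything bends away from the line can be checked numerically by looking at the signs of the residuals (point minus line) in order of the horizontal variable. For a well-fitting line the signs tend to be mixed. Long runs of the same sign hint at curvature. The sketch below is an informal heuristic under that assumption, not a formal test, and the halving data is made up.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_signs(xs, ys):
    """Fit a line, then report its slope and the sign of each residual,
    ordered by x, as a string of '+' and '-' characters."""
    slope, intercept = fit_line(xs, ys)
    ordered = sorted(zip(xs, ys))
    signs = "".join(
        "+" if y - (slope * x + intercept) >= 0 else "-" for x, y in ordered
    )
    return slope, signs

# Hypothetical data: y halves each time x goes up by one (a curved decrease).
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [64, 32, 16, 8, 4, 2, 1]

slope, signs = residual_signs(xs, ys)
print(slope < 0, signs)
```

Here the negative slope matches the negative direction, and the pattern of points above the line at both ends and below it in the middle is what a curve, rather than a line, would produce.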
Another example of information not available from the separate descriptions of husbands' and wives' ages is the mean age of husbands with wives of a certain age.
For instance, what is the average age of husbands with year-old wives? Finally, we do not know the relationship between the husband's age and the wife's age. We can learn much more by displaying the bivariate data in a graphical form that maintains the pairing. Figure 2 shows a scatter plot of the paired ages.
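A question such as the mean age of husbands whose wives are a given age can only be answered when the pairing is kept. A minimal sketch, assuming the data is stored as (husband age, wife age) pairs: the function name and the ages below are invented for illustration and are not the data set behind the figures.

```python
from collections import defaultdict

def mean_husband_age_by_wife_age(couples):
    """Group husbands' ages by the paired wife's age and average each group."""
    groups = defaultdict(list)
    for husband_age, wife_age in couples:
        groups[wife_age].append(husband_age)
    return {wife_age: sum(ages) / len(ages) for wife_age, ages in groups.items()}

# Hypothetical (husband_age, wife_age) pairs.
couples = [(36, 35), (40, 35), (52, 50), (48, 50), (56, 50), (30, 28)]
means = mean_husband_age_by_wife_age(couples)
print(means[35], means[50], means[28])
```

If the two columns of ages were stored separately and shuffled, this grouping would be impossible, which is why the paired form carries more information.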
The x-axis represents the age of the husband and the y-axis the age of the wife.
Figure 2. Scatter plot showing wife's age as a function of husband's age.
There are two important characteristics of the data revealed by Figure 2. First, it is clear that there is a strong relationship between the husband's age and the wife's age: the older the husband, the older the wife tends to be. When one variable (Y) increases with the second variable (X), we say that X and Y have a positive association. Conversely, when Y decreases as X increases, we say that they have a negative association.
Second, the points cluster along a straight line. When this occurs, the relationship is called a linear relationship. Figure 3 shows a scatter plot of Arm Strength and Grip Strength from individuals working in physically demanding jobs including electricians, construction and maintenance workers, and auto mechanics.
Not surprisingly, the stronger someone's grip, the stronger their arm tends to be. There is therefore a positive association between these variables.