There are currently (as of 2016-04-15) 6 datasets with Tattersalls sales data, these are:
Name | Sales included | Dim |
---|---|---|
tatts_2010 | feb, craven, guineas, july, oct_bk1, oct_bk2, oct_bk3, autumn_hit, breeders_flat, dec_yearlings, dec_foals, dec_mares | 7796 rows, 12 cols |
tatts_2011 | feb, craven, breeze_up, july, oct_bk1, oct_bk2, oct_bk3, autumn_hit, breeders_flat, dec_yearlings, dec_foals, dec_mares | 6946 rows, 12 cols |
tatts_2012 | dec_mares, dec_foals, dec_yearlings, autumn_hit, oct_bk3, oct_bk2, oct_bk1, july, guineas_hit, guineas, craven, feb | 7101 rows, 12 cols |
tatts_2013 | feb, craven, guineas, guineas_hit, july, oct_bk1, oct_bk2, oct_bk3, oct_bk4, autumn_hit, dec_yearlings, dec_foals, dec_mares | 7192 rows, 12 cols |
tatts_2014 | july, guineas, guineas_hit, craven, feb, oct_bk1, oct_bk2, oct_bk3, oct_bk4, autumn_hit, dec_yearlings, dec_foals, dec_mares | 7429 rows, 12 cols |
tatts_2015 | feb, craven, guineas, guineas_hit, july, oct_bk1, oct_bk2, oct_bk3, oct_bk4, autumn_hit | 5389 rows, 12 cols |
There is a function to collect data from sales not included in the package, see here.
To load a dataset:
data(tatts_2010)
Each of the 5 datasets have the same 12 variables, year, sale, sale_name, lot_no, horse, sex, color, sire, dam, seller, buyer, price
, as can be seen in the table above, the same sales take place each year.
It is easy to combine the 5 datasets, such as:
data(tatts_2010, tatts_2011, tatts_2012, tatts_2013, tatts_2014)
# combine all 5 datasets into one large dataset
tattersalls <- rbind(tatts_2010, tatts_2011, tatts_2012, tatts_2013, tatts_2014)
# clean workspace
rm(tatts_2010, tatts_2011, tatts_2012, tatts_2013, tatts_2014)
The new tattersalls
dataset has 36464 rows and 12 columns, the sale_name
variable can serve as an individual sale (so it includes the year of the sale, eg. feb_2010), while the sale
variable will match across multiple years.
Loading other libraries dplyr
and ggplot2
we can quickly analyse data:
library(dplyr)
library(ggplot2)
The average price of lots in each sale across multiple years:
sale_summaries <- tattersalls %>%
group_by(sale, year) %>%
summarise(n = n(),
mean = mean(price, na.rm=T))
ggplot(sale_summaries, aes(x=year, y=mean)) +
geom_bar(stat="identity", fill="#D9220F") +
theme_bw() +
theme(text = element_text(size=10)) +
facet_wrap(~sale)
One thing stands out immediately from the plots above, don’t go to October Book 1 unless you have deep pockets, the average price paid has more than doubled from 96,001 Guineas in 2010 to 212,637 Guineas in 2014. October Book 2 has also (unsurprisingly) seen a small increase in average price paid over the same period. The Craven sale has also seen a small increase over recent years, but prices are in other sales appear to be quite consistent.
We can also find the 20 sires with the highest average price over the 5 years:
tattersalls %>%
group_by(sire) %>%
summarise(n = n(),
mean = mean(price, na.rm=T),
sd = sd(price, na.rm=T),
min = min(price, na.rm=T),
max = max(price, na.rm=T)) %>%
filter(n >= 50) %>%
arrange(desc(mean)) %>%
head(20)
## Source: local data frame [20 x 6]
##
## sire n mean sd min max
## (chr) (int) (dbl) (dbl) (dbl) (dbl)
## 1 galileo (ire) 572 211314.35 409799.91 800 5000000
## 2 sea the stars (ire) 182 199763.89 150603.40 6000 650000
## 3 danehill (usa) 52 124195.56 379083.76 2000 2400000
## 4 new approach (ire) 238 108774.87 110347.56 2500 600000
## 5 montjeu (ire) 326 105223.87 156474.57 800 850000
## 6 dubawi (ire) 372 97881.49 169775.01 800 1600000
## 7 sadler's wells (usa) 156 96807.09 201949.61 1000 1700000
## 8 oasis dream (gb) 594 94710.82 143449.05 800 1100000
## 9 monsun (ger) 84 93459.70 109112.35 2000 600000
## 10 dansili (gb) 434 84944.14 184443.84 800 2700000
## 11 fastnet rock (aus) 154 83655.46 84648.97 4000 540000
## 12 raven's pass (usa) 140 81896.75 104884.44 800 800000
## 13 danehill dancer (ire) 499 80129.26 227178.69 800 4000000
## 14 invincible spirit (ire) 653 80056.97 128397.23 800 2100000
## 15 lope de vega (ire) 83 75378.38 98854.63 1000 650000
## 16 pivotal (gb) 630 72477.19 225981.29 800 4700000
## 17 lawman (fr) 281 70083.76 301039.09 800 4500000
## 18 shamardal (usa) 478 69765.14 118171.36 800 1700000
## 19 poet's voice (gb) 105 69760.64 104033.18 3500 700000
## 20 rip van winkle (ire) 130 68780.37 65442.26 4000 400000
Sea The Stars is top of the list, his progeny costing a cool 224,100 Guineas on average, Galileo is a very close second, with a large drop down, of almost 100,000 Guineas, to Danehill. Perhaps more interesting than Sea The Stars topping Galileo is the standard deviation of their prices, Sea The Stars with a much smaller sd perhaps suggests buyers believe his progeny will prove to be successful (see Taghrooda), but are still showing some restraint and aren’t quite willing to part with the big sums seen with Galileo, whose most expensive sale fetched 9 times the most expensive Sea The Stars sale.
It’s also possible to look at the buyers or agents with the highest average price paid over the 5years.
tattersalls %>%
group_by(buyer) %>%
summarise(n = n(),
mean = mean(price, na.rm=T),
sd = sd(price, na.rm=T),
min = min(price, na.rm=T),
max = max(price, na.rm=T)) %>%
filter(n >= 50) %>%
arrange(desc(mean)) %>%
head(20)
## Source: local data frame [20 x 6]
##
## buyer n mean sd min max
## (chr) (int) (dbl) (dbl) (dbl) (dbl)
## 1 demi o'byrne 51 342019.61 264811.14 6000 1300000
## 2 john ferguson bloodstock 314 215003.18 296659.68 25000 4000000
## 3 j warren bloodstock 86 176343.02 146367.91 16500 680000
## 4 shadwell estate company 234 173025.64 135849.13 35000 1100000
## 5 c gordon-watson bloodstock 281 149810.68 196409.25 800 1700000
## 6 david redvers bloodstock 202 121936.63 223385.67 5000 2500000
## 7 john warren bloodstock 93 118645.16 95735.10 11000 600000
## 8 badgers bloodstock 53 118452.83 159747.10 10000 800000
## 9 dwayne woods 59 95652.54 79130.74 3000 450000
## 10 mrs a skiffington 63 81595.24 80855.60 3000 400000
## 11 blandford bloodstock 452 80300.00 103073.80 800 925000
## 12 jeremy brummitt 67 78544.78 85540.03 6500 475000
## 13 anthony stroud bloodstock 242 75130.17 74083.30 800 425000
## 14 richard frisby bloodstock 90 72338.89 83634.50 2000 370000
## 15 brian grassick bloodstock 52 70176.92 61604.80 2200 360000
## 16 peter & ross doyle bloodstock 382 68610.47 76552.10 800 800000
## 17 hugo merry bloodstock 141 68357.45 92076.88 1700 675000
## 18 stephen hillen bloodstock 171 67578.95 167784.40 3000 2100000
## 19 sackvilledonald 191 66107.33 58515.52 2500 460000
## 20 kern/lillingston association 139 57712.23 70722.67 3800 460000
Demi O’Byrne (an agent associated with Coolmore) isn’t as active as others in the top 5, with just 50 purchases, but the average price he pays is almost double that of John Ferguson (links with Godolphin).