The "insights" of SAC Smart Discovery unraveled!

Wouldn't it be great to have a tool that does the hard work of searching for correlations and comes up with the clearest insights available? Well, SAP states that the Smart Discovery add-on of SAP Analytics Cloud (SAC) is such an innovation. But is it really? Does it really deliver analysis results that no one would have come up with themselves? And are those analyses really of high quality? I tested that with great pleasure. I have invested a lot of time in analysing data in R, and that is what I will use to unravel "Smart Discovery" (to the best of my ability, and in my own opinion).

The basis:

The analysis I set up looks at what actually affects the value of a footballer. My original goal was to predict the value of a player. To do this I used the FIFA 2019 dataset, where I predicted values based on a training set consisting of 60% of the full set. Predicting the correct value turned out to be much more difficult than I thought: only 1286 out of a total of 7372 predictions were correct, which is only about 17% of all cases. This is logical, of course, because the dataset is quite limited and many factors that influence the value of a player are not included in it.
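As a quick check on that share, the hit rate can be recomputed from the two counts given above. This is a minimal Python sketch, not the R code behind the analysis; the variable names are illustrative, and how a prediction was judged "correct" is not specified in this article.

```python
# Counts taken from the paragraph above.
correct_predictions = 1286
total_predictions = 7372

# Share of predictions that were counted as correct.
hit_rate = correct_predictions / total_predictions
print(f"{hit_rate:.1%}")  # prints 17.4%
```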

 

Still, it made me curious: I wanted to find out WHAT influences the value of a player and what the relationships are. I hoped SAC Smart Discovery would give me some results beyond those I had already produced myself...

The Smart Discovery outcome:

To run the Smart Discovery of SAP Analytics Cloud (SAC), you select a measure or dimension that you "want to know more about", as SAP puts it. With the value of a player as the "I want to know more about" setting, a single page was created (in SAP's "sales pitch" it is actually 4...), and it looks like the image below. My comments are included in orange.

So there are actually some really interesting and fair aspects in the Smart Discovery result above:

- Summary information: the total value does not say much, although the maximum value is 118,500,000.00, which is actually Neymar Jr because of his transfer to Paris Saint-Germain, and the minimum value (0) shows the wide range of values.

- Graph top right: of the 18,206 values, 16,677 have a player value of 0, which means that these players can be signed free of charge. Could be interesting if this is what you are aiming for.

- Graph bottom left: a correlation (association in the words of SAP) between the total wage paid and the "most valuable clubs". This is a fair relationship, because more valuable players receive a higher salary (= wage).

 

But as my short analysis in the picture above shows, there are 2 graphs that really make me raise my eyebrows: "Value by Preferred Foot" and "Association between Jersey Number and Value by Club". Let's dive into both and see if those graphs really contain the right data...

Does "Value by Preferred Foot" really provide the right information?

Apparently your value as a player depends on your preferred foot (i.e. whether you are left- or right-footed). I would think that "both" would be the most valuable, but SAC states that a player is more valuable if he is right-footed. But is this really the case? Or is this chart nothing more than a misleading picture?

The graph insinuates that when you are a "righty", your value as a player is much higher. Just looking at the graph and applying some logic, I conclude that the presented value is the SUM of values. That is not the right way to present an analysis of value by Preferred Foot, if you ask me. It is common knowledge that there are more right-footed players than left-footed players. Let's use R in RStudio (which is my backup for this whole document) to check the player counts:

  • Left: 4201
  • Right: 13894
  • Unassigned: 111

In this case it would therefore be more logical to calculate the average player value per preferred foot. This results in the following:

Preferred Foot | Average Value | Difference L vs R
Left | 2,591,279 | +220,213
Right | 2,371,066 |
Unassigned | 153,243 |

So you can actually say that on average you are more valuable as a left-footed player, because the numbers above don't lie...
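The difference between summing and averaging can be made concrete with the counts and averages listed above. The following is a minimal Python sketch, not the original R analysis; the totals are approximated as count times average (the averages are rounded), and the variable names are illustrative.

```python
# Player counts and average values per preferred foot, copied from the article.
players = {"Left": 4201, "Right": 13894, "Unassigned": 111}
avg_value = {"Left": 2_591_279, "Right": 2_371_066, "Unassigned": 153_243}

# Approximate total value per group: count * average.
total_value = {foot: players[foot] * avg_value[foot] for foot in players}

# The "winner" depends entirely on the aggregation that is used.
top_by_sum = max(total_value, key=total_value.get)
top_by_mean = max(avg_value, key=avg_value.get)
difference = avg_value["Left"] - avg_value["Right"]

print("Highest total value:  ", top_by_sum)
print("Highest average value:", top_by_mean)
print("Average, Left minus Right:", difference)
```

Ranking by the sum rewards the group with the most players, while ranking by the average compares what a typical player in each group is worth.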

Note that the graph in SAC emphasizes the fact that "Position ST" affects the graph the most. For your information: ST stands for striker. So let's see if this is really just a logical aspect or if Smart Discovery has actually come up with an interesting point.

In total there are 27 positions, ranging from goalkeeper to striker, covering all possible positions on the pitch. Of all those positions, ST has the most players (namely 2145). And if you look at the top 10 most valuable players in the ST position, their names make it even clearer why they have such an influence on the chart:

Name | Value | Preferred Foot | Position
H. Kane | 83,500,000 | Right | ST
Cristiano Ronaldo | 77,000,000 | Right | ST
R. Lewandowski | 77,000,000 | Right | ST
S. Aguero | 64,500,000 | Right | ST
M. Icardi | 64,500,000 | Right | ST
R. Lukaku | 62,500,000 | Left | ST
G. Bale | 60,000,000 | Left | ST
C. Immobile | 52,000,000 | Right | ST
A. Lacazette | 45,000,000 | Right | ST
M. Depay | 42,000,000 | Right | ST

So SAC insinuates that ST has a great influence on the graph, and that is not incorrect. The "only" thing it has missed is that there are more Strikers than, for example, Right Attacking Midfielders (RAM). That, together with the valuable players in the list above, is why Strikers have the greatest influence on the value. As the graph below (created in RStudio using ggplot) shows, the position with the highest average value is not Striker but LF (Left Forward), which contains only 15 players, including players such as Hazard, Dybala and Iniesta.

Due to the low number of players, LF is not seen as a major influence on the value, but looking at the average value, this is a position that should be taken into account. And here too: the total list of LF players contains only 3 left-footed players...

It has to be said, though: SAP's graph is correct. The total SUM of the value for right-footed players is much larger, but that is obvious because there are more right-footed players. The influence of Strikers is also the greatest, but that is logical as well, because this is the position most players are in. But the question is: does the graph bring you insights? Or is it the analysis we just did?

Let's dive into the more complex graph to analyse: the correlation between Jersey Number and value...

Is the association really present (and correct) between Jersey Number, Club and Value?

The analysis of a so-called "relationship" (or association, as SAP calls it) between Jersey Number, Club and value is somewhat more difficult. When I first saw this graph, my reaction was: "hell no, you ain't no more valuable when you're playing with number 10."

The first thing that really surprises me is that the Smart Discovery actually summed up the Jersey Numbers. This means that a Club whose players all wear Jersey Numbers over 30 ends up further along the x-axis than a Club whose players wear low numbers such as 2. So we are not even looking at an association between Jersey Number and Value, if I may put it that way. But, leaving this first strange aspect aside, let's do an analysis to see if there is some kind of truth in this graph. I will go by the title of the graph rather than by the data points, so I'm going to look for the association between Jersey Number and Value per Club.
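The effect of summing Jersey Numbers per club can be illustrated with a small Python sketch. The two squads below are made up purely for illustration (they are not taken from the FIFA dataset), and the function name is hypothetical; the point is only that two clubs with identical player values end up at different x positions when the numbers are summed.

```python
from collections import defaultdict

# Hypothetical squads, NOT from the FIFA dataset: (club, jersey number, value).
# Both squads have 11 players worth 1,000,000 each; only the numbering differs.
rows = [("Club A", number, 1_000_000) for number in range(1, 12)]
rows += [("Club B", number, 1_000_000) for number in range(30, 41)]


def sum_per_club(rows):
    """Sum jersey numbers and player values per club."""
    totals = defaultdict(lambda: {"jersey_sum": 0, "value_sum": 0})
    for club, number, value in rows:
        totals[club]["jersey_sum"] += number
        totals[club]["value_sum"] += value
    return dict(totals)


totals = sum_per_club(rows)
for club, t in totals.items():
    print(club, "jersey sum:", t["jersey_sum"], "value sum:", t["value_sum"])
```

Both clubs have the same summed value, yet Club B lands much further along a "summed Jersey Number" axis simply because of how its shirts are numbered.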

 

Apparently Real Madrid is the most valuable club, because it is the highest point on the y-axis. To check this, I plotted the top 10 most valuable clubs in RStudio using the ggplot R package, generating the following chart:

The value of a club is calculated as the sum of all its player values. The graph in R therefore corresponds to the y-axis of the SAC chart: the higher a club is on the y-axis, the more valuable it is.

To know if a player is more valuable when wearing a certain number, it is good to look at the top 4 players (Neymar Jr, De Bruyne, Messi and Hazard) shown in the chart below. The marked players are the players with a value above 90,000,000, and therefore the 4 most valuable players.

Okay, knowing the names of the top 4 players, we can take a look at the Jersey Numbers.

Name | Jersey Number | Value | Overall | Club
Neymar Jr | 10 | 118,500,000 | 92 | Paris Saint Germain
L. Messi | 10 | 110,500,000 | 94 | FC Barcelona
K. De Bruyne | 7 | 102,000,000 | 91 | Manchester City
E. Hazard | 10 | 93,000,000 | 91 | Chelsea

If you look at the list like that, it seems that if you play with Jersey Number 10, you are very valuable. Or rather: the most valuable players usually turn out to play with number 10. It depends on how you phrase the relationship... But does that mean that all the players who play with number 10 are equally valuable? No, of course it doesn't. This list of the bottom 10 players with Jersey Number 10 confirms it:

Name | Jersey Number | Value | Overall | Club
M. Etxeberria | 10 | 0 | 74 | No Club
I. Kovacs | 10 | 0 | 73 | No Club
S. Nakamura | 10 | 0 | 72 | Jubilo Iwata
J. Campos | 10 | 0 | 71 | No Club
B. Nivet | 10 | 0 | 71 | ESTAC Troyes
A. De Jong | 10 | 0 | 59 | No Club
B. Singh | 10 | 0 | 58 | No Club
R. Cretaro | 10 | 40,000 | 57 | Sligo Rovers
Ryan Yong Gi | 10 | 50,000 | 58 | Vegalta Sendai
K. Brennan | 10 | 60,000 | 60 | St. Patrick’s Athletic

The names on this list don't ring a bell (at least not to me), but they all play with Jersey Number 10. So it's clear that the Jersey Number in itself has no influence on your value as a player. However, playing for an important club in the top 10 does seem to let the Jersey Number influence your value. Or is it the other way around? Does the high-value player choose his Jersey Number? Then it is the personal influence of the player, rather than the Jersey Number, that is at work here. Playing with Jersey Number 10 for a club such as VVV Venlo does not give you a value of 118,500,000 like Neymar Jr has.

And a list of correlations with a player's value, calculated with the Spearman method (the dataset has large outliers, hence Spearman), shows that there is almost no correlation (SAP's "association") between a player's value and his Jersey Number:

Attribute | Spearman correlation with Value
Overall | 0.9163082
Wage | 0.7839799
Reactions | 0.7507812
Potential | 0.7455360
Ball Control | 0.7375126
Composure | 0.7012314
Jersey Number | -0.1779670
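To show why a rank-based method copes better with outliers, here is a minimal Python sketch of the Spearman correlation, computed as the Pearson correlation of the ranks, with tied values receiving their average rank. It is not the R code behind the figures above (in R, `cor(x, y, method = "spearman")` is the usual route); the function names and the toy numbers are assumptions made purely for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (made up): perfectly increasing, but with one extreme outlier in y.
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 1000]
print("Pearson: ", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

On the toy data the outlier drags the Pearson coefficient down, while the Spearman coefficient only looks at the ordering and is unaffected by how large the outlier is.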

On top of this, Jersey Numbers are often linked to the position on the field, so suggesting that there is a relationship between Jersey Number and value is rather odd. Players in the football industry have recently been increasingly exposed to branding and merchandise, and they prefer to keep their Jersey Number the same even when they change clubs. So a player's personal influence over his Jersey Number is only increasing, but only to the point where the CLUB actually thinks the player is valuable enough to get his favourite Jersey Number.

Based on SAC Smart Discovery's graph, there are many more things I could analyse to test the "accuracy" of the graph provided. But with this start I think I have made a fair point: don't always believe what you see at first sight...

Conclusion

For some basic insights, SAC Smart Discovery can be useful, although I would recommend that you don't just follow it blindly. Smart Discovery is not human: no matter what data you drag into this part of the tool, it behaves the same way, and what it actually does in the background remains a black box. A measure is a measure, a dimension a dimension, and that's it. Make sure you know your data before you simply 'accept' what SAC Smart Discovery brings back. As you can see, it's not always what it seems!

I must point out that the graphs produced in SAC are not incorrect, although they do not add much value to the analysis. The diagrams are simple, but the titles can be very misleading (look at the diagram showing the association between Jersey Number and value...). To create the "perfect SAP sales pitch Smart Discovery outcome", the data must be set up in a very specific way (as SAP did for their code jams to demonstrate the usefulness of Smart Discovery). The data you use yourself will differ in many cases, which makes Smart Discovery less useful than suggested.

If you disagree, I would like to invite you to exchange views and discuss it in greater depth together. Also, if you would like to receive the R analysis that substantiates this document, please do not hesitate to contact me at d.ambaum@jugo.nl
