Android: Netrunner» Forums » General

Subject: [OCTGN] ~100k anonymized games dataset

Konstantinos Thoukydidis
Germany
Frankfurt am Main
Hessen
I've now exported and compiled the latest anonymized dataset, containing ~105k matches.

https://docs.google.com/file/d/0B-gMiPlH3rBAWWF5THdqQklZcTQ/...

You know, when I started with this plugin, I never expected such popularity. In fact, when my stat gatherer told me he'd do some tweaks to make sure we could quickly grab win rates if we ever reached more than 100k stats, I laughed it off as being unrealistic.

Alexander Danev
Bulgaria
Dimitrovgrad/Plovdiv
Haskovo
Overall - 51%-49% - very good balance...

Konstantinos Thoukydidis
Germany
Frankfurt am Main
Hessen
Yep, note that you can see some basic calculations already at the open stats page: http://84.205.248.92/slaghund/slagview.aspx

Michele Lupo
Italy
Lecce
Puglia
Woah the collective has a 61% win rate!
 
The Hound
Israel


Is there any way to extract my own statistics from OCTGN? I.e. how many games I've won/lost?



 
Konstantinos Thoukydidis
Germany
Frankfurt am Main
Hessen
The Hound wrote:


Is there any way to extract my own statistics from OCTGN? I.e. how many games I've won/lost?





No, I don't provide this info

The Hound
Israel
DbZer0 wrote:
The Hound wrote:


Is there any way to extract my own statistics from OCTGN? I.e. how many games I've won/lost?





No, I don't provide this info


Is there any way that I can figure out my own Corp number though? Maybe somewhere in my OCTGN interface, or some server query?




 
Trent Hamm
United States
Huxley
Iowa
The Hound wrote:
DbZer0 wrote:
The Hound wrote:


Is there any way to extract my own statistics from OCTGN? I.e. how many games I've won/lost?





No, I don't provide this info


Is there any way that I can figure out my own Corp number though? Maybe somewhere in my OCTGN interface, or some server query?


I'm pretty sure it's anonymized. If it's not... they have some issues to work out.
 
Simon Gunkel
I wrote the script for the anonymization of the first sets. If Db0 has used it again, then the IDs aren't even the same for this and the older dataset.
Basically the script takes all usernames that have ever played, sorts them alphabetically and then assigns them numbers. It then replaces the usernames with these numbers.

I'll be able to start analyzing tomorrow, and hopefully get the next data dealer out by the weekend.

That being said - if you've got a game ID for a game you've played, you can find that game and then figure out the ID.
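
For anyone curious, the sort-and-number scheme described above could look roughly like the sketch below. This is not the actual script, just an illustration; the record layout and the field names `corp_player` and `runner_player` are made up for the example.

```python
def build_id_map(usernames):
    """Map each distinct username to a number, in alphabetical order."""
    return {name: i for i, name in enumerate(sorted(set(usernames)), start=1)}


def anonymize(games, fields=("corp_player", "runner_player")):
    """Return copies of the game records with usernames replaced by numbers.

    The field names are hypothetical; the real export may be laid out differently.
    """
    names = [g[f] for g in games for f in fields]
    id_map = build_id_map(names)
    return [{**g, **{f: id_map[g[f]] for f in fields}} for g in games]


# Tiny made-up sample, only to show the shape of the data.
games = [
    {"corp_player": "zed", "runner_player": "alice", "result": "corp"},
    {"corp_player": "alice", "runner_player": "bob", "result": "runner"},
]
anonymized = anonymize(games)
```

With a scheme like this, a number depends on a name's position in the full sorted list, so any new username that sorts earlier shifts every number after it. That would fit with IDs not matching between two exports.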

Konstantinos Thoukydidis
Germany
Frankfurt am Main
Hessen
The anonymized data sets do not include game IDs or lobby names, precisely so that people can't easily figure out what their number is.

The Hound
Israel
DbZer0 wrote:
The anonymized data sets do not include game IDs or lobby names, precisely so that people can't easily figure out what their number is.



I see. Is there any function that would let me know how many wins/losses I have? Seems like basic info that any OCTGNer would want. Maybe they can add that to the pledge drive...



 
Grish Noren
The Hound wrote:
I see. Is there any function that would let me know how many wins/losses I have? Seems like basic info that any OCTGNer would want. Maybe they can add that to the pledge drive...


You could always keep a journal; no need to wait for a mysterious dev to have time to do this. And I mean that in the nicest way possible to all people. It'd be easy enough to just have an excel spreadsheet you keep track of, or a notebook you tally in with totals at the bottom of each page, which are then summed with the previous page. Sure, it's time on your end, but it's not a lot... and you can keep notes on the games too to review and reflect.

Steven Tu
South Africa
The Hound wrote:
DbZer0 wrote:
The anonymized data sets do not include game IDs or lobby names, precisely so that people can't easily figure out what their number is.



I see. Is there any function that would let me know how many wins/losses I have? Seems like basic info that any OCTGNer would want. Maybe they can add that to the pledge drive...





Db0 is specifically not letting people use his data to say to other people "I rock and you suck". That's his intention. I used to keep an excel. If I played much these days I'd still be using it. It's no biggie, really.
 
Dave Sutcliffe
United Kingdom
Manchester
There's some great stats in here to be pulled out, I'm going to have so much Pivot Table joy this evening I may need tissues to clean up the mess.

Some of it is obvious - there's a direct correlation between Influence in the Corp deck and the Corp's Win %.

Some of it is less obvious - on average the better Corp identities have shorter game times than the worse Corp identities, but the games they win tend to be longer than the games they lose. Not quite worked that one out yet.

LIGHTBULB: AAH! It's because the Corp always goes first, so would tend to win on its own turn and lose on the Runner's turn. I'll have to find a way to adjust for that. Makes sense cos Jinteki is the one that flips the equation and wins more often in fewer turns than it loses, as it wins on the Runner's turn.


There's definitely at least one blog worth of stuff here, probably two.

Bryan Goodwin
United States
West Linn
Oregon
It would be helpful if we could generate some kind of standard for scrubbing the OCTGN data pre-analysis. It would be simple enough to script and make available as a community resource for those wishing to analyze the data.

Here's a list of parameters that may be helpful in cleaning the data; comments and suggestions are welcome, this isn't a particularly refined list. With several contributors we should be able to get it to a usable state. The intent is to remove the furthest outliers that may skew the 'serious' data that people are drawing conclusions about the game as a whole from. Obviously we can't find them all (e.g. "no ice" decks), but even partial trimming should help where a piece of the data indicates something may not have been right in a particular game.

Consider whether to disregard (or set aside for separate analysis) records where:

* Duration = 180
* Runner deck size < Identity min
* Runner deck size >= 60 cards
* Runner inf > Identity max
* Corp Inf > Identity max
* Corp deck size < Identity min
* Corp deck size >= 60 cards
* Corp Flatlined = yes
* Runner Score < 0
* Runner Score >= 10
* Corp Score < 0
* Corp Score >= 10 (except NBN - Beale)


Anything else to add here? Could any of these accidentally take out legitimate data? The deck size one is sized at 60 to allow a fairly wide range of experiments with greater deck size, while still excluding crazy decks.
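
As a starting point for the shared script, the checklist above might be sketched like this. It is only a rough illustration: every field name (`duration`, `runner_deck`, `corp_faction` and so on) is an assumption rather than the real column name in the export, and the Beale exception is approximated by exempting any NBN corp, since I'm assuming the data doesn't say which agendas were in the deck.

```python
MAX_DECK = 60  # upper bound on deck size suggested in the list above


def scrub_reasons(g):
    """Return the names of the rules from the list above that a record trips."""
    checks = [
        ("duration == 180", g["duration"] == 180),
        ("runner deck < min", g["runner_deck"] < g["runner_min_deck"]),
        ("runner deck >= 60", g["runner_deck"] >= MAX_DECK),
        ("runner influence > max", g["runner_inf"] > g["runner_max_inf"]),
        ("corp influence > max", g["corp_inf"] > g["corp_max_inf"]),
        ("corp deck < min", g["corp_deck"] < g["corp_min_deck"]),
        ("corp deck >= 60", g["corp_deck"] >= MAX_DECK),
        ("corp flatlined", g["corp_flatlined"]),
        ("runner score < 0", g["runner_score"] < 0),
        ("runner score >= 10", g["runner_score"] >= 10),
        ("corp score < 0", g["corp_score"] < 0),
        # Assumption: any NBN corp is exempt, standing in for "NBN - Beale".
        ("corp score >= 10", g["corp_score"] >= 10 and g["corp_faction"] != "NBN"),
    ]
    return [name for name, tripped in checks if tripped]


def split_games(games):
    """Split records into (kept, set_aside); nothing is thrown away."""
    kept, set_aside = [], []
    for g in games:
        reasons = scrub_reasons(g)
        if reasons:
            set_aside.append((g, reasons))
        else:
            kept.append(g)
    return kept, set_aside


# Made-up records, only to show the expected shape.
clean = dict(duration=25, runner_deck=45, runner_min_deck=45, runner_inf=15,
             runner_max_inf=15, corp_deck=49, corp_min_deck=45, corp_inf=14,
             corp_max_inf=15, corp_flatlined=False, runner_score=4,
             corp_score=7, corp_faction="Haas-Bioroid")
suspect = {**clean, "duration": 180, "runner_deck": 70}
kept, set_aside = split_games([clean, suspect])
```

Keeping the reasons next to each set-aside record makes it easy to check whether a rule is accidentally catching legitimate games before anyone commits to dropping them.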

Dave Sutcliffe
United Kingdom
Manchester
Miaowara_Tomokato wrote:
It would be helpful if we could generate some kind of standard for scrubbing the OCTGN data pre-analysis. It would be simple enough to script and make available as a community resource for those wishing to analyze the data.

Here's a list of parameters that may be helpful in cleaning the data; comments and suggestions are welcome, this isn't a particularly refined list. With several contributors we should be able to get it to a usable state. The intent is to remove the furthest outliers that may skew the 'serious' data that people are drawing conclusions about the game as a whole from. Obviously we can't find them all (e.g. "no ice" decks), but even partial trimming should help where a piece of the data indicates something may not have been right in a particular game.

Consider whether to disregard (or set aside for separate analysis) records where:

* Duration = 180
* Runner deck size < Identity min
* Runner deck size >= 60 cards
* Runner inf > Identity max
* Corp Inf > Identity max
* Corp deck size < Identity min
* Corp deck size >= 60 cards
* Corp Flatlined = yes
* Runner Score < 0
* Runner Score >= 10
* Corp Score < 0
* Corp Score >= 10 (except NBN - Beale)


Anything else to add here? Could any of these accidentally take out legitimate data? The deck size one is sized at 60 to allow a fairly wide range of experiments with greater deck size, while still excluding crazy decks.


I thought much the same. You can easily dump anything where influence is illegal for that identity. You can dump anything where deck size is more than Identity min + 10, or less than Identity min.
 
Dave Sutcliffe
United Kingdom
Manchester
Wow. I'm crunching numbers and there are some staggering stats in here.

Only 1 Corp Identity has a 50% chance of winning without flatlining the runner. All other identities are odds-against winning on Agenda points.

That's pretty sobering news.
 
Grish Noren
Magicdave wrote:
Wow. I'm crunching numbers and there are some staggering stats in here.

Only 1 Corp Identity has a 50% chance of winning without flatlining the runner. All other identities are odds-against winning on Agenda points.

That's pretty sobering news.


Not surprising though; SE is a primary win condition for Wayland/NBN. Jinteki is clearly a net damage winner with an SE splash.

If I'm surprised about anything, it's that HAAS effectively pulls off a flatline for 50% of their wins. Sure it has brain damage, but it's got to be one of the hardest ways to win for that corp.

That said, I think that's bad news on some level. Corps ought to be able to compete in the agenda space. This also confirms that running damage answers might be the most vital thing a runner can do.
 
Gregory Pettigrew
United States
Massachusetts
Magicdave wrote:
Wow. I'm crunching numbers and there are some staggering stats in here.

Only 1 Corp Identity has a 50% chance of winning without flatlining the runner. All other identities are odds-against winning on Agenda points.


BWBI?
 
Alex Rockwell
United States
Hillsboro
Oregon
I think we should dump all games where at least one side used 0 influence, since these indicate core deck (unmodified) play.

This would help improve the win rate stats.

Dave Sutcliffe
United Kingdom
Manchester
etherial wrote:
Magicdave wrote:
Wow. I'm crunching numbers and there are some staggering stats in here.

Only 1 Corp Identity has a 50% chance of winning without flatlining the runner. All other identities are odds-against winning on Agenda points.


BWBI?


HB:EtF
 
Dave Sutcliffe
United Kingdom
Manchester
Alexfrog wrote:
I think we should dump all games where at least one side used 0 influence, since these indicate core deck (unmodified) play.

This would help improve the win rate stats.


I can do that easily enough. Do you think it's a significant number, though?
 
Alex Rockwell
United States
Hillsboro
Oregon
Magicdave wrote:


I can do that easily enough. Do you think it's a significant number, though?


It appears to be thousands of games.
Maybe we should throw out games that used low amounts of influence as well, like less than 9 or something?


I also wonder what result we would get if we threw out all games where the runner deck size was not exactly 40 or 45 cards.

We could also think about throwing out corp games without certain numbers of cards, but for corp there are some reasons you might play different numbers.


Theoretically, decks with specific card numbers might be more tuned. 45 for runner, or 40 (Chaos Theory). 49 for corp, or 44 (NBN TWIY). Maybe include 45 and 40 as well. Or look at 40 and 45 as separate data.


What are the win rates if we look at only games where:

Both players used all or almost all their influence.
The runner was 45 or 40 cards.
The corp was 49 or 44 cards.


These games might reveal some difference in stats if both players are using more tuned decklists.
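
Something like the following would answer that question. It is a rough sketch with made-up field names (`runner_inf`, `corp_deck`, `winner` and so on), and "almost all their influence" is read here as at most 1 point unspent, which is purely my assumption.

```python
def is_tuned(g, slack=1):
    """True if both decks leave at most `slack` influence unspent and
    match the deck sizes listed above (runner 40/45, corp 44/49)."""
    return (
        0 <= g["runner_max_inf"] - g["runner_inf"] <= slack
        and 0 <= g["corp_max_inf"] - g["corp_inf"] <= slack
        and g["runner_deck"] in (40, 45)
        and g["corp_deck"] in (44, 49)
    )


def corp_win_rate(games):
    """Fraction of games won by the Corp, or None for an empty sample."""
    if not games:
        return None
    return sum(g["winner"] == "corp" for g in games) / len(games)


def make_game(runner_inf, corp_inf, runner_deck, corp_deck, winner):
    """Build a made-up record; both identities are assumed to have 15 influence."""
    return dict(runner_inf=runner_inf, runner_max_inf=15,
                corp_inf=corp_inf, corp_max_inf=15,
                runner_deck=runner_deck, corp_deck=corp_deck, winner=winner)


# Invented sample data, only to exercise the filter.
games = [
    make_game(15, 15, 45, 49, "corp"),    # tuned
    make_game(14, 15, 40, 44, "runner"),  # tuned
    make_game(0, 15, 45, 49, "corp"),     # zero-influence runner
    make_game(15, 15, 50, 49, "corp"),    # oversized runner deck
]
tuned = [g for g in games if is_tuned(g)]
overall_rate = corp_win_rate(games)
tuned_rate = corp_win_rate(tuned)
```

Comparing `overall_rate` with `tuned_rate` on the real dataset would show whether the untuned decks are actually moving the win rates, and `slack` can be loosened if the tuned sample turns out too small.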

Adam Perry
United States
Alabama
Alexfrog wrote:
Magicdave wrote:


I can do that easily enough. Do you think it's a significant number, though?


It appears to be thousands of games.
Maybe we should throw out games that used low amounts of influence as well, like less than 9 or something?


I also wonder what result we would get if we threw out all games where the runner deck size was not exactly 40 or 45 cards.

We could also think about throwing out corp games without certain numbers of cards, but for corp there are some reasons you might play different numbers.


Theoretically, decks with specific card numbers might be more tuned. 45 for runner, or 40 (Chaos Theory). 49 for corp, or 44 (NBN TWIY). Maybe include 45 and 40 as well. or look at 40 and 45 as separate data.


What are the win rates if we look at only games where:

Both players used all or almost all their influence.
The runner was 45 or 40 cards.
The corp was 49 or 44 cards.


These games might reveal some difference in stats if both players are using more tuned decklists.


I think the only thing you should throw out is 0 influence decklists. Playing with extra runner cards or fewer corp cards definitely doesn't connote the same skill (or lack thereof) as running with 0 influence.
 
Daniel D
I can agree on fewer corp cards, but I've yet to see a runner deck that couldn't be made stronger by bringing it down to the minimum deck size. If the corp is running 40/45 they could be playing a rush deck. Arguably they should just be using 40 with TWIY* but NEXT offers a not-terrible rush plan.

EDIT: Assuming the plan is to cut out a lot of the untuned decks.
 