February 11, 2016

Some ideas (part 1)

In these two posts I want to talk about two ideas I have had over the last few years, both on anonymity and privacy. You know, the worst ideas are the hardest to get out of your mind. You should not take these ideas too seriously, since they are only a thought exercise, made after a few beers at a hacker camp :)

A picture of me trying to get ideas out of my mind
The first one aims to increase awareness about credentials, and has the benefit of allowing analysis of the graph of accounts that we have inherited from decades of computer usage and online activity.

The second one is simply a way of giving away less metadata to social networks like Facebook or Twitter, using a buffer that strips, fixes or randomizes the metadata, specifically the metadata used for behavior analysis, such as circadian rhythms and other personal navigation patterns.

Let's start with the first one. It has happened to me more than once (and once is already too much) to realize that the way my credentials are hierarchically structured does not make any sense at all. Let me explain what I mean by that. Our interaction with technology, both online and offline, is getting more and more complex, and we have created a multitude of services and accounts. Some of them are interconnected by design through APIs, that is, you can easily create an account on website X if you already have an account on website Y. Others are purely offline accounts, but a whole layer of other applications and authentications rests upon them, such as our hard disks and our smartphones.

In this "model" your boundary is a list of nodes close to the root of the tree.
So there's not a crisp line between me and the external world.


The idea:

It might seem a silly idea, but perhaps that is because you have not yet realized how complex the credential graph you have created so far is. You might also not have realized how many times you log in from insecure workstations, untrusted devices or untrusted networks, nor where you have typed all the passwords that you repeat over and over on many different websites.

For instance, I would like to be "actively aware" that my precious LinkedIn, Stack Exchange and OkCupid accounts, which are based on my Facebook login, ultimately depend on an email account whose password I chose in middle school, and that I gave access to that email to plenty of online-and-forgotten services. Moreover, those accounts are logged in on my smartphone, which I leave unlocked. Here is a story that might make you realize the importance of a program like this: How I Lost My $50,000 Twitter Username. It could perhaps have been avoided with cautious and prudent handling of what I call the "authentication graph". Another reason for being crystal clear about how you structure your authentication graph is password leaks. If you reuse your password in multiple places and one of these databases leaks some credentials, then with just your email it is possible to retrieve the list of websites where you are enrolled with that email, so changing your password in only one place won't help.

So I think it is fruitful to map this network of relationships, and when you have to map relationships, there's only one tool that you need: graphs!

Here is how I thought of modelling an ecosystem of credentials: you build a graph where every node represents an entity, that is, something which can be accessed with some credentials. Hard disks, online accounts of every kind, laptops, smartphones and so on are entities. Directed edges in this graph denote a relation of the kind "can give access to", in order to capture the paradigm where you can recover access to a connected account further away in the graph.
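To make the model a bit more tangible, here is a minimal Python sketch of such a graph as a plain adjacency dictionary. The entity names and the helper reachable_from are made up for illustration, they are not part of any existing tool. The helper just answers the question "if this entity is compromised, what else can be reached?".

```python
from collections import deque

# Hypothetical example data: each key "can give access to" the entities in its list.
access = {
    "smartphone": ["old_email", "facebook"],
    "old_email": ["facebook"],
    "facebook": ["linkedin", "okcupid"],
    "linkedin": [],
    "okcupid": [],
}

def reachable_from(graph, start):
    """Return every entity that can be reached once `start` is compromised."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            # Skip entities already visited, and never report the start itself.
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable_from(access, "old_email")))
```

With this toy data, losing the old email exposes the Facebook account and everything that hangs off it, which is exactly the kind of thing I would like to be "actively aware" of.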


Now you can do some basic computation on the graph:
    • The user sets, for each node, a "sensitivity" score and a password. (The password can be scored and then forgotten, saving only the score.)
    • Once you have this graph you can impose a few simple rules:
      1. The sensitivity of a node cannot decrease from the leaves to the root: this means that if a resource X gives access to a resource Y and Y has sensitivity 10, the sensitivity of X cannot be less than 10.
      2. The strength of the password should grow with the sensitivity, that is, from the leaves to the root.
      3. Since you might have nodes with equal sensitivity on different branches, we require that nodes with equal sensitivity have equal password strength.
      4. Update: There should be no cycles, i.e. the graph should be at least a DAG, if not a tree.
The software, with a simple graph traversal algorithm, would spot anomalies in the graph and suggest possible fixes (e.g. password changes).
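The four rules can be sketched in a few lines of Python. This is only one possible reading of them, not a finished design: the Entity class, the function names and the example scores are all hypothetical, and I read rule 2 as "a more sensitive node must never have a weaker password than a less sensitive one".

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Entity:
    sensitivity: int  # score chosen by the user
    strength: int     # password strength score (the password itself is not kept)

def has_cycle(nodes, edges):
    """Depth-first search with three colours; a grey-to-grey edge is a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def visit(n):
        color[n] = GREY
        for m in edges.get(n, []):
            if color[m] == GREY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)

def check(nodes, edges):
    """Return a list of human-readable rule violations (empty if all is fine)."""
    problems = []
    if has_cycle(nodes, edges):
        problems.append("rule 4: the graph contains a cycle")
    # Rule 1: X gives access to Y, so X must be at least as sensitive as Y.
    for x, targets in edges.items():
        for y in targets:
            if nodes[x].sensitivity < nodes[y].sensitivity:
                problems.append(f"rule 1: {x} gives access to {y} but is less sensitive")
    # Rules 2 and 3 are checked on every pair of nodes, on any branch.
    for a, b in combinations(sorted(nodes), 2):
        na, nb = nodes[a], nodes[b]
        if na.sensitivity == nb.sensitivity:
            if na.strength != nb.strength:
                problems.append(f"rule 3: {a} and {b} have equal sensitivity but different strength")
        elif (na.sensitivity - nb.sensitivity) * (na.strength - nb.strength) < 0:
            more, less = (a, b) if na.sensitivity > nb.sensitivity else (b, a)
            problems.append(f"rule 2: {more} is more sensitive than {less} but has a weaker password")
    return problems

# Hypothetical example: a weak old email sits above a sensitive Facebook account.
nodes = {
    "old_email": Entity(sensitivity=5, strength=2),
    "facebook": Entity(sensitivity=8, strength=6),
    "linkedin": Entity(sensitivity=8, strength=6),
}
edges = {"old_email": ["facebook"], "facebook": ["linkedin"]}

for problem in check(nodes, edges):
    print(problem)
```

In this toy graph the only anomaly is on rule 1: the old email gives access to Facebook but is scored as less sensitive. Raising its sensitivity to 8 then forces, through rule 3, a password as strong as the Facebook one, which is the kind of fix the software could suggest.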

Concretely:

For the implementation, there are several alternatives:
  • An online solution is a bold idea, since many users won't trust the platform enough. The functionality can be pretty much equivalent, though, if you don't ask for the password itself but just for the relations. Specific questions on the length of the password and the character set used may be an alternative.
  • A KeePass plugin is another notable option, since you don't have to collect the user's data again, but it might restrict the potential user base, since the KeePass universe is fragmented into various versions.
  • A new standalone program would solve most of the previous problems, but I think it is the most challenging option, and you need to gain the trust of the users all over again.
  • A browser extension would be perfect. You might have the chance to build the graph automatically, removing the effort that the user has to make in order to actively use the software.
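About the first option: if you only ask for the length of the password and the character set, a rough strength score could be the usual length times log2(alphabet size) estimate. The sketch below assumes every character is picked at random, so it is an upper bound and will overrate a human-chosen password; the function name estimate_bits is made up for illustration.

```python
import math

def estimate_bits(length, charset_size):
    """Upper bound on password entropy in bits, assuming random characters."""
    if length < 0 or charset_size < 1:
        raise ValueError("length must be >= 0 and charset_size >= 1")
    return length * math.log2(charset_size)

# 8 lowercase letters versus 12 printable ASCII characters (94 symbols).
print(round(estimate_bits(8, 26), 1))
print(round(estimate_bits(12, 94), 1))
```

A score like this is enough to compare nodes with each other, which is all the rules above need, without the platform ever seeing a password.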
I am aware that the whole problem can be mitigated by introducing 2FA or authentication methods other than passwords. But we might be far from that utopia, and there are situations (as we saw before) where that will not suffice.
Here I'm only describing an idea that has been on my mind for a while, hoping to eventually receive some feedback. Visualizing this data graphically might also help to manage separate online identities.

References:

[1] https://www.youtube.com/watch?v=fHZJzkvgles Dan Geer on the future of the security


Bye