This is a transcript for a video linked here: Cybersecurity STRIDE analysis of a Raspberry Pi IoT project.
00:00:00.480 --> 00:00:06.080 in my earlier video i explained the concept of threat analysis and the stride threat 00:00:06.080 --> 00:00:11.440 classification system in that video i looked at the theory but in this video i'm going 00:00:11.440 --> 00:00:16.960 to take a look at how it can be used in a real example using an internet of things iot project 00:00:18.080 --> 00:00:21.680 this is going to be simplified but could help show 00:00:21.680 --> 00:00:26.960 you how to turn the theory into practice if you haven't yet seen my earlier video then you may 00:00:26.960 --> 00:00:31.760 want to click the link at the top of this video or in the description before watching this video 00:00:33.600 --> 00:00:38.240 if you enjoy this video or it provides something useful then please click the like button at any 00:00:38.240 --> 00:00:44.560 point in this video that will help share this with others as a quick recap stride is a mnemonic that 00:00:44.560 --> 00:00:51.840 describes possible attack vectors against a system s is spoofing t is tampering r 00:00:51.840 --> 00:00:59.680 is repudiation i information disclosure d denial of service and e is elevation of privilege 00:01:02.000 --> 00:01:06.240 you can do the threat modeling yourself or you can use the microsoft threat modeling tool 00:01:06.880 --> 00:01:12.160 there are pros and cons to each approach one of the advantages of using a threat modeling tool 00:01:12.160 --> 00:01:18.640 is that it may identify threats you wouldn't otherwise think about one of the disadvantages 00:01:18.640 --> 00:01:23.920 is that it's not fully aware of your system so may miss some of the important threats the 00:01:23.920 --> 00:01:29.280 threat modeling tool is also only available for windows and its analysis is not so useful for 00:01:29.280 --> 00:01:35.520 applications on non-microsoft platforms in this case it's for linux running on a raspberry pi 00:01:37.200 --> 00:01:41.760 for this project i did use the tool but i also 
did some manual analysis as well 00:01:43.040 --> 00:01:47.760 both of these two methods picked up some threats that would have been missed by the other 00:01:51.600 --> 00:01:55.520 the project i'm doing is for my raspberry pi pixel server project 00:01:56.800 --> 00:02:02.960 this is an open source project and it's available on github and i'll link 00:02:02.960 --> 00:02:09.840 to my project page in the description this started life as a non-iot project 00:02:11.440 --> 00:02:16.400 although it was designed with a web interface it was only initially designed for use on a secure 00:02:16.400 --> 00:02:21.840 network and so didn't include authentication or any of the other security features 00:02:23.840 --> 00:02:28.960 i wanted to be able to connect the project to the internet so i therefore identified this would need 00:02:28.960 --> 00:02:35.440 to include authentication and in particular the aaa security framework so i've performed 00:02:35.440 --> 00:02:41.760 a significant rewrite of the code to add the login authentication and various other aspects 00:02:43.280 --> 00:02:48.560 the threat analysis then becomes important to ensure that i'm not exposing this project or other 00:02:48.560 --> 00:02:55.440 systems to unnecessary risk when it's connected to the internet before you start the threat analysis 00:02:56.080 --> 00:03:01.920 you should decide on the scope of the analysis is this just looking at a particular program or the 00:03:01.920 --> 00:03:07.680 overall system if you're restricting the scope then you may still need to consider the overlap 00:03:08.640 --> 00:03:14.560 in this case i'm concentrating on my own code but i do also consider the overall system 00:03:15.280 --> 00:03:19.600 and threats associated with how the code and the operating system are configured 00:03:23.520 --> 00:03:28.000 so to perform stride analysis you normally start with a data flow diagram such as the 00:03:28.000 --> 00:03:35.120 one shown here this identifies where data 
flows from one system or part of the system to another 00:03:36.000 --> 00:03:41.760 you can then look at each of the data flows and apply the six attack types and identify 00:03:41.760 --> 00:03:48.320 the threats and the risks in this example i'm going to be performing analysis on the overall design 00:03:48.320 --> 00:03:53.600 and the application code itself there are more components and it would also be useful to look in terms 00:03:53.600 --> 00:03:59.840 of the operating system configuration as well but i won't be going into that in much detail here 00:04:00.640 --> 00:04:08.000 i'll be using two different data flow diagrams and i'll be covering the second one later this diagram 00:04:08.000 --> 00:04:14.480 shows the overall design the main part of the program is a web application written 00:04:14.480 --> 00:04:17.120 in python flask which is shown on the right here 00:04:19.360 --> 00:04:27.200 this uses http but with login information going to the application i decided early on that i wanted 00:04:27.200 --> 00:04:34.480 to use https encryption and whilst it is possible to use https on the web application itself 00:04:35.360 --> 00:04:41.120 i decided to use a web proxy instead i've explained why i made that decision 00:04:41.120 --> 00:04:45.440 on my other channel penguin tutor see the link above or in the description 00:04:48.560 --> 00:04:54.720 the central circle is labeled as a web server and this represents the reverse proxy running nginx 00:04:55.680 --> 00:05:02.960 although it's also acting as a web server as well the unencrypted traffic flows between the proxy 00:05:02.960 --> 00:05:08.800 and the web application the reverse proxy could be running on the same computer system 00:05:08.800 --> 00:05:14.240 in which case that traffic would not leave the system but if they are not on the same system 00:05:14.240 --> 00:05:19.840 then the network connection between the proxy and the web application does need to be secured 00:05:22.000 --> 00:05:28.000 on the 
left you can see the human user and the traffic there is encrypted using tls 00:05:29.280 --> 00:05:34.400 this red line represents the internet boundary and the traffic going to and 00:05:34.400 --> 00:05:39.680 from the user will traverse the internet and this is a significant point for analysis 00:05:43.760 --> 00:05:48.400 first i'm going to look at the http traffic between the proxy and the web application 00:05:49.520 --> 00:05:53.840 so i've already identified this is a potential security vulnerability as the data is not 00:05:53.840 --> 00:06:00.000 encrypted when using http so we already know this is going to identify some vulnerabilities 00:06:00.560 --> 00:06:03.920 my mitigation plan is to ensure that this is a protected network 00:06:04.880 --> 00:06:12.880 the network protection can be provided by securing the physical network eg using an ethernet switch 00:06:14.640 --> 00:06:21.680 wireless network security running a secure wi-fi network or avoided completely by installing 00:06:22.880 --> 00:06:30.960 the web application on the same computer as the proxy although even with these mitigations in 00:06:30.960 --> 00:06:36.800 place we may still need to look through some of the threats that are identified 00:06:40.400 --> 00:06:43.840 the first threat that the tool identified is replay attacks 00:06:45.440 --> 00:06:50.720 this is something i was already aware of it's protected when going over the internet through 00:06:50.720 --> 00:06:55.440 the use of https and it's basically down to the security of the local network for this 00:06:57.520 --> 00:07:02.320 the same for collision attacks 00:07:02.320 --> 00:07:06.240 cross-site scripting is something i've already identified as a possible risk 00:07:06.800 --> 00:07:12.160 this is where scripts could be inserted through unsanitized information i'll mention this in 00:07:12.160 --> 00:07:19.600 more detail a bit later weak authentication again i'll cover this in more detail later 00:07:20.480 --> 
00:07:22.160 this is already something i was looking at 00:07:24.400 --> 00:07:30.000 and then it also lists elevation using impersonation linked to authentication 00:07:30.000 --> 00:07:35.440 this is something i had already considered both through the authentication but also considering 00:07:35.440 --> 00:07:44.320 network based authentication which is included as an option to mitigate against this then certain 00:07:44.320 --> 00:07:49.760 configuration criteria are needed both for the authentication on the application and on the proxy 00:07:53.040 --> 00:07:58.240 and then look at the main interaction with the user shown here at this internet boundary 00:07:59.680 --> 00:08:04.720 because of the way this particular app works then it may be that some of the threats identified here 00:08:04.720 --> 00:08:07.440 should in fact apply to the web application instead 00:08:08.080 --> 00:08:14.640 rather than the web server which is just proxying the request onto the web application 00:08:18.560 --> 00:08:24.240 the traffic over here is over https which is encrypted there's still some threats identified 00:08:24.240 --> 00:08:31.520 especially looking at the interaction with the user at this point look at data repudiation this 00:08:31.520 --> 00:08:38.240 is about being able to prove what happened this is achieved through logging on both nginx which logs 00:08:38.240 --> 00:08:44.880 urls visited and the web application which logs the user interaction such as successful logins 00:08:49.440 --> 00:08:52.640 cross-site scripting doesn't really relate to this 00:08:52.640 --> 00:08:56.080 as it's a proxy i've already mentioned this for the web application now 00:08:58.960 --> 00:09:06.880 elevation using impersonation again users shouldn't be able to log into the proxy directly 00:09:07.760 --> 00:09:14.080 but this is relevant for the actual server that is running the proxy and in particular the weakness 00:09:14.080 --> 00:09:19.520 of password logins i've already created a 
video on adding two-factor authentication 00:09:20.240 --> 00:09:26.720 for ssh logins to linux machines which is a good way of adding additional protection 00:09:28.240 --> 00:09:34.800 the next two threats it identified are both denial of service attacks a potential process 00:09:34.800 --> 00:09:42.480 crash in the case of the web proxy then it's using nginx which is a well tested application 00:09:43.920 --> 00:09:51.360 and to try and mitigate against that problem on the application that can be done through testing 00:09:53.360 --> 00:09:56.960 there are some things that can be covered through configuration 00:09:56.960 --> 00:09:59.600 it becomes a balance between different security aspects 00:10:01.360 --> 00:10:06.320 and i'll discuss that in more detail later when i talk about the password hashing 00:10:08.960 --> 00:10:15.760 the data flow is much harder to protect against but one of the things for this is that this is 00:10:15.760 --> 00:10:23.360 not a critical system if it's unavailable there will not be any harm or real risk 00:10:24.560 --> 00:10:29.760 through a dos attack even if there is a denial of service attack 00:10:29.760 --> 00:10:35.040 against the proxy there's a good chance that it's still possible to get to the web application 00:10:36.000 --> 00:10:38.560 from the local network without going through the proxy 00:10:40.240 --> 00:10:48.480 remote code execution is not really possible either on the proxy or the web application and 00:10:49.280 --> 00:10:56.320 the mitigation to avoid that is just by sanitizing the data received but it doesn't actually execute 00:10:56.320 --> 00:11:03.120 any of the data received anyway so the next one it comes up with is cross site request forgery 00:11:03.120 --> 00:11:09.440 now this is an interesting one it's something i forgot to include when i did my manual analysis 00:11:10.240 --> 00:11:15.120 although it is something i'm familiar with so it is something maybe i should have been thinking 
00:11:15.120 --> 00:11:23.600 about but this is where the tool came in useful an explanation of cross site request forgery 00:11:23.600 --> 00:11:28.160 deserves its own video i can't really go into the full details of it on here 00:11:28.880 --> 00:11:35.760 but i did some research about the flask module and there is a module available 00:11:35.760 --> 00:11:41.840 which adds protection for the login sessions and i've enabled that in the web application 00:11:43.440 --> 00:11:51.840 another threat it identified was spoofing the human user destination the mitigation in this is looking 00:11:51.840 --> 00:11:59.040 at the authentication mechanisms which i'll be explaining later and finally the last threat 00:11:59.040 --> 00:12:05.680 that was identified here is that the human denies receiving the data as a repudiation 00:12:06.640 --> 00:12:11.840 there's no data that's really passed to the user that's particularly important on 00:12:11.840 --> 00:12:18.480 here so it's not something i'm really concerned with but i have already added logging mechanisms 00:12:19.280 --> 00:12:27.680 as i've already mentioned so that would be the repudiation aspect so i said i created a second 00:12:27.680 --> 00:12:33.040 data flow diagram and this shows the interaction between the application and the local file system 00:12:35.520 --> 00:12:42.400 and there were seven threats identified by the microsoft threat analysis tool for this i'm not 00:12:42.400 --> 00:12:47.040 going to list them in full there's just a few that are relevant so i'm just going to pull these out 00:12:48.800 --> 00:12:53.920 and there are three that are directly relevant and the first is about weak credential storage 00:12:54.640 --> 00:13:00.640 usernames and passwords will be stored on a local file system i'd already planned to use 00:13:00.640 --> 00:13:08.960 a suitable password hash and there also needs to be restrictions on who can access that password file 00:13:11.600 --> 00:13:18.960 second was the use of 
excessive resources the resources available on the raspberry pi are 00:13:18.960 --> 00:13:23.920 very limited and it wouldn't take much to overload that but it's not really a concern 00:13:23.920 --> 00:13:33.840 it's not a critical system but the use of resources has influenced some of my decisions when coding 00:13:34.880 --> 00:13:42.000 and the third aspect is weak access control for a resource 00:13:43.680 --> 00:13:52.240 again thinking about the security of the password file the risk is also not just being able to 00:13:52.240 --> 00:13:57.760 access that password file but that someone with edit permissions to the application source code 00:13:57.760 --> 00:14:04.160 could also use that for an elevation of privilege and that's something i've thought about as well 00:14:07.600 --> 00:14:13.600 so after performing that initial analysis the next step is documenting and tracking these threats 00:14:14.160 --> 00:14:20.880 and providing appropriate resources to eliminate or mitigate the risk i'm going to show just a few 00:14:20.880 --> 00:14:25.840 of the things that i've implemented in my project one thing to remember is that this isn't a one-off 00:14:25.840 --> 00:14:30.480 task this is something that can be reviewed at different points in the software life cycle 00:14:32.800 --> 00:14:35.440 so i've already covered some of the ways that i've addressed some of the 00:14:35.440 --> 00:14:39.600 threats but there are four that i think are useful to cover in more detail as an 00:14:39.600 --> 00:14:43.840 example of how this threat analysis is then used to make the system more secure 00:14:46.640 --> 00:14:49.760 so the ones i'm going to look at are cross-site scripting 00:14:50.720 --> 00:14:58.640 weak authentication weak credential storage and weak access control for a resource so these are 00:14:59.360 --> 00:15:06.560 four of the threats that were identified earlier that i just think are going to give a good example 00:15:07.680 --> 00:15:11.840 of how it's 
influenced the code that i've created 00:15:16.160 --> 00:15:22.720 looking first at cross-site scripting this is where an attacker can send data to the server 00:15:22.720 --> 00:15:28.240 which can then be passed on to other users typically this is done by saving some data 00:15:28.240 --> 00:15:37.760 which includes say html script tags and some javascript code in that so for example a regular 00:15:37.760 --> 00:15:44.160 user could update their real name field so instead of just showing their name it's got 00:15:44.160 --> 00:15:52.080 some javascript code hidden inside that and then when an admin views their profile then that code 00:15:52.080 --> 00:15:58.000 would be running in the admin's browser and that could be used to give a user elevated privilege 00:15:58.640 --> 00:16:03.840 or create a backdoor username and password or some other way of doing that 00:16:06.000 --> 00:16:08.480 the first protection for this is limiting who can 00:16:09.040 --> 00:16:15.920 edit the user details and in this case only administrators can change any of the details 00:16:17.760 --> 00:16:22.240 if a user is an admin already then it doesn't make sense for them to try and deliberately give 00:16:22.240 --> 00:16:28.080 someone else admin by trying to bypass that they could just do it using the normal admin means 00:16:30.560 --> 00:16:34.320 it may be that i want to give users the ability to update parts of their own 00:16:34.320 --> 00:16:40.080 profile in future so this wasn't the only protection i added so i did look at other 00:16:40.080 --> 00:16:47.840 ways as well i've taken a multi-stage approach to sanitizing the data the first is that flask 00:16:47.840 --> 00:16:53.840 itself automatically strips the html tags from forms unless it's explicitly told not to do so 00:16:55.040 --> 00:17:00.400 but then i've also added additional code in the objects that are updated to check for tags that 00:17:00.400 --> 00:17:06.640 are not allowed and this provides a good 
form of protection that i'm not only protecting the code 00:17:06.640 --> 00:17:11.520 as it's implemented at the moment but i'm also thinking about the future and some of 00:17:11.520 --> 00:17:18.320 the additional checks that are currently redundant may be required to reduce this threat in future 00:17:22.400 --> 00:17:24.640 also looking at weak authentication 00:17:26.560 --> 00:17:34.400 what are the authentication methods deployed i'm going to use two here and the first one is 00:17:34.400 --> 00:17:42.720 ip address this is something that can effectively whitelist ip addresses 00:17:42.720 --> 00:17:48.080 and they can have a guest like access they can access the application without having to log in 00:17:49.200 --> 00:17:53.680 this was a decision i made because i wanted to be able to continue to use this with my home 00:17:53.680 --> 00:18:02.240 automation system without needing to authenticate if you're using ip-based authentication then 00:18:02.800 --> 00:18:09.280 the user or computer can interact with the system control the leds but only if they're 00:18:09.840 --> 00:18:16.000 on an approved network so you can configure that just for a local ip address or your local ip range 00:18:17.440 --> 00:18:24.000 and if they want additional access or access out of the network then they'll need to log in 00:18:27.440 --> 00:18:31.680 then using a standard login via username and password 00:18:32.880 --> 00:18:38.320 i've already created videos on the risk of just using username and password authentication 00:18:39.120 --> 00:18:44.960 but it is considered sufficient for this system if your system is storing critical 00:18:44.960 --> 00:18:49.920 information then you may want to look at adding two-factor authentication or something similar 00:18:52.480 --> 00:18:56.800 an important consideration here is that the password should never be stored 00:18:56.800 --> 00:18:59.840 as plain text i'll be covering this next 00:19:00.880 --> 00:19:05.520 also the 
password should never be transmitted unsecured over a public network 00:19:06.320 --> 00:19:15.680 in this case https is used between the user and the proxy and it's only ever passed unencrypted over 00:19:15.680 --> 00:19:23.840 the local network i've already said about securing that that's something that needs to have been done 00:19:26.000 --> 00:19:31.200 so on weak credential storage as i've mentioned the password is only ever stored in a hashed 00:19:31.200 --> 00:19:38.560 form this means it's possible to check a password is valid by hashing the password that the user 00:19:38.560 --> 00:19:46.080 provides but it's not possible to work back to find the original password from the hash i've 00:19:46.080 --> 00:19:51.040 covered some of the flaws in this in another video essentially the security of this password is down to 00:19:51.040 --> 00:19:57.760 the algorithm used there are different algorithms available i started by using what's considered to 00:19:57.760 --> 00:20:03.840 be one of the most secure which is argon2 
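A minimal sketch of the hash-and-compare idea described above, using only the Python standard library. PBKDF2 is swapped in here purely so the example runs without extra packages; it is not the algorithm or the code used in the project, and the names `hash_password`, `verify_password`, `ITERATIONS` and the `salt$hash` storage format are assumptions made for illustration.

```python
import hashlib
import hmac
import os

# Assumed values for illustration only; not taken from the project.
ITERATIONS = 100_000
SALT_BYTES = 16


def hash_password(password: str) -> str:
    """Return 'salt_hex$hash_hex'; the plain text password is never stored."""
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def verify_password(password: str, stored: str) -> bool:
    """Hash the supplied password with the stored salt and compare the results."""
    salt_hex, hash_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    # Constant-time comparison, so timing does not reveal how much matched.
    return hmac.compare_digest(digest, bytes.fromhex(hash_hex))
```

Checking a login then only needs the stored string, for example `verify_password(submitted_password, stored_value)`, and the original password cannot be read back out of that string.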
i implemented that in code but the problem 00:20:03.840 --> 00:20:10.240 was that the system would take a long time to process just a simple login not only that if the 00:20:10.240 --> 00:20:18.640 user tried again thinking it had not accepted their password then with multiple requests 00:20:19.680 --> 00:20:25.200 this could cause the system to grind to a halt and this was particularly the case on a raspberry 00:20:25.200 --> 00:20:30.080 pi zero which is the lowest spec of the machine that this application is designed to run on 00:20:32.080 --> 00:20:36.960 this meant there's a conflict between the level of security and the availability remember 00:20:36.960 --> 00:20:44.320 availability is another of the security requirements in the aaa security framework 00:20:46.000 --> 00:20:53.200 rather than increasing the system requirements so needing a more powerful processor i've added 00:20:53.200 --> 00:21:01.040 a configuration option which allows a security compromise use argon2 for maximum security which 00:21:01.040 --> 00:21:12.400 is useful on a higher spec computer or sha256 for better performance sha256 is not as secure but 00:21:13.520 --> 00:21:18.880 it works well on the 32-bit operating system and the low spec raspberry pi 00:21:21.680 --> 00:21:28.720 and this brings us on to the file permissions restricting access to that file is an important 00:21:28.720 --> 00:21:35.440 step to ensuring it is as secure as possible if someone has login to the system and update access 00:21:35.440 --> 00:21:42.480 to the password file they can also grant any user admin access so it's important to restrict who can 00:21:42.480 --> 00:21:48.400 do that another check that i've included is to check for invalid characters when saving 00:21:48.400 --> 00:21:54.160 the username and passwords it shouldn't be a problem as only admin can add users 00:21:54.720 --> 00:22:00.240 but it does check that nobody is trying to trick the system by adding an extra colon which could 
00:22:01.920 --> 00:22:05.120 allow you to insert say a different password or 00:22:06.160 --> 00:22:12.480 other fields which is kind of similar to how an sql injection attack might be used against 00:22:12.480 --> 00:22:17.440 a database but thinking specifically about the file format in this case 00:22:22.720 --> 00:22:31.040 and then the fourth one is weak access control for a resource and i'm going to cover here a bit 00:22:31.040 --> 00:22:37.280 more than the threat identified by the microsoft threat analysis tool now something to be aware of 00:22:37.920 --> 00:22:43.040 here is that this is looking at the interaction between the operating system and users of that system 00:22:43.840 --> 00:22:48.400 and in most cases the only users who would log on to an operating system for this kind of 00:22:48.400 --> 00:22:53.840 setup would be administrators in which case most of this is not really a direct risk 00:22:55.680 --> 00:22:59.840 it is still good practice though to treat it as though there were multiple users on this system 00:23:00.560 --> 00:23:04.400 and that can also help prevent problems in the event that an attacker 00:23:04.400 --> 00:23:10.800 somehow manages to get on the system because by setting appropriate file permissions you can prevent 00:23:11.520 --> 00:23:16.800 either information leakage or that user getting a privilege escalation 00:23:20.880 --> 00:23:24.800 it should be very obvious you need to restrict who can read the password file 00:23:25.520 --> 00:23:28.160 even though the passwords are stored as a password hash 00:23:28.960 --> 00:23:35.840 depending upon the quality of the passwords the users use it may be possible to crack those 00:23:37.920 --> 00:23:44.320 the next thing to consider is the security of the code this is something that should be considered 00:23:44.320 --> 00:23:50.720 for any server as if someone can edit or change the code they may be able to add say a backdoor 00:23:51.520 --> 00:23:56.080 an additional thing to be aware 
of in this program is it needs to run with root 00:23:56.080 --> 00:24:02.560 admin permissions this is a requirement both to be able to use port 80 00:24:03.360 --> 00:24:12.560 but also to be able to access the neopixels that are the rgb leds that it uses 00:24:14.880 --> 00:24:21.280 and what this means is that if a regular user is able to edit those executables they 00:24:21.280 --> 00:24:30.080 could potentially gain enhanced privileges there's one idea that could be a potential 00:24:30.640 --> 00:24:35.440 way of mitigating this which is that you could monitor the executable files there are some 00:24:35.440 --> 00:24:43.280 tools available such as tripwire which can be used to alert to changes in the program files often 00:24:43.280 --> 00:24:48.640 these tools will alarm when it's too late perhaps when the user has already got escalated privileges 00:24:49.520 --> 00:24:57.760 but in this example as long as you're not running flask in debug mode the python code will not be 00:24:57.760 --> 00:25:06.720 reloaded unless the system or at least the process is restarted so that could be a while after 00:25:06.720 --> 00:25:13.920 the user's made that change so using a monitoring tool such as tripwire could help with catching any 00:25:13.920 --> 00:25:21.760 malicious code before the system is restarted and before an attacker is able to exploit that 00:25:23.920 --> 00:25:27.840 i'm going to finish this here this hasn't been a complete analysis 00:25:28.400 --> 00:25:35.200 it's only covered some of the risks and i'll put a bit more information on my website so 00:25:35.760 --> 00:25:41.680 look in the description for the link to that the amount of time needed for the threat analysis 00:25:41.680 --> 00:25:48.560 is going to depend upon how secure you need the system to be what is in scope eg the 00:25:48.560 --> 00:25:56.080 operating system versus the application and the complexity of the program you're developing i've 00:25:56.080 --> 00:26:00.160 skipped 
some of the features and some of the analysis that has been done on this project 00:26:01.840 --> 00:26:05.920 but this video has shown some of the steps that need to be performed 00:26:07.120 --> 00:26:10.800 and shown this using a real example rather than just the theory 00:26:11.920 --> 00:26:19.120 so we started with the data flow diagrams identified where there's a potential 00:26:19.920 --> 00:26:27.520 of the data being accessed or being manipulated and then we looked at how we can apply these 00:26:28.800 --> 00:26:36.960 six threats to that data and then looked at what changes we could make in the code to mitigate 00:26:36.960 --> 00:26:42.080 against these there are some different models that can be used stride is just one of them 00:26:43.200 --> 00:26:48.160 but whatever method or tools you use it's helpful to have a framework which helps you to focus 00:26:48.160 --> 00:26:54.880 on identifying these potential vulnerabilities and so i hope this video has been useful 00:26:55.600 --> 00:27:02.080 if so please give it a like to help let others know i'll be creating some other cyber security 00:27:02.080 --> 00:27:08.720 videos in future so please subscribe if you're interested in future topics and let me know in 00:27:08.720 --> 00:27:14.560 the comments if there is anything particular you'd like me to cover on cyber security 00:27:16.560 --> 00:27:26.480 thanks for watching and i hope to see you again in a future video
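The check for invalid characters when saving usernames and passwords, mentioned in the file permissions part of the transcript, could look roughly like the sketch below. The transcript only says that an extra colon could be used to insert other fields, so the three-field record layout, the function names and the list of rejected characters are all assumptions for illustration rather than the project's actual file format.

```python
# Hypothetical record layout: username:password_hash:access_level
# Colons would add extra fields; newlines would add extra records.
FORBIDDEN_CHARACTERS = (":", "\n", "\r")


def is_safe_field(value: str) -> bool:
    """Accept only non-empty values that cannot break out of their field."""
    return value != "" and not any(ch in value for ch in FORBIDDEN_CHARACTERS)


def format_user_record(username: str, password_hash: str, access_level: str) -> str:
    """Build one line for the password file, rejecting any unsafe field."""
    fields = (username, password_hash, access_level)
    for field in fields:
        if not is_safe_field(field):
            raise ValueError("invalid character in user record field")
    return ":".join(fields)
```

A username such as `mallory:fakehash:admin` is rejected here instead of being written out as a record with an attacker-chosen hash and access level, which is the file-format equivalent of the sql injection comparison made in the video.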