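Implementing XSS Protection with Jsoup in Spring Boot 2.x

Author: 撸小鱼  Date: 2023-11-17 06:40:47

Back-end applications routinely accept user-supplied text such as comments and replies. Apart from scenarios where a specific set of rich-text tags and attributes may be accepted (for example b, ul, li, h1, h2, h3...), dangerous characters and tags must be filtered out to prevent XSS attacks.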

1. What is XSS?

The following two articles give a general idea:

  • XSS attack basics and a collection of common XSS attack scripts
  • XSS filter cheat sheet

2. Guidelines

  • Never trust user input or request parameters (text, uploads, and everything else).

  • See rule 1.

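3. Implementation approach

Filter the relevant content according to the concrete business scenario. This article uses Jsoup.

jsoup is an HTML parser for Java. Its Whitelist class (renamed Safelist in jsoup 1.14.2 and later; the examples here use 1.13.1) filters text content: it strips disallowed tags and attributes while preserving the rich-text formatting you need.
For example, if the whitelist allows the b tag (and allows no attributes on it), a piece of HTML is transformed as follows: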

Before filtering:

<b style="xxx" onclick="<script>alert(0);</script>">abc</>

After filtering:

<b>abc</b>

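Main Whitelist methods

  • addAttributes(String tag, String... attributes): Allows the given attributes on a tag. tag is the tag name and attributes are the attribute names to allow. For example, addAttributes("a", "href", "class") allows the a tag to carry the href and class attributes. To allow a set of attributes on every tag, use :all, for example addAttributes(":all", "class").
  • addEnforcedAttribute(String tag, String attribute, String value): Adds an enforced attribute to a tag; if the tag already has that attribute, its value is overwritten. tag is the tag name, attribute is the attribute name, and value is the attribute value. For example, addEnforcedAttribute("a", "rel", "nofollow") means every a tag in the output carries rel="nofollow".
  • addProtocols(String tag, String attribute, String... protocols): Restricts the protocols a URL attribute may use. For example, addProtocols("a", "href", "ftp", "http", "https") lets the href attribute of the a tag point to ftp, http, and https URLs.
  • addTags(String... tags): Adds tags to the Whitelist.
  • basic(): Allows a, b, blockquote, br, cite, code, dd, dl, dt, em, i, li, ol, p, pre, q, small, span, strike, strong, sub, sup, u, ul, with appropriate attributes. Links in a tags may use http, https, ftp, or mailto, and rel=nofollow is enforced on them after cleaning. Images are not allowed.
  • basicWithImages(): Same as basic(), plus the img tag with src pointing to http or https image URLs.
  • none(): Keeps text only; all HTML is removed.
  • preserveRelativeLinks(boolean preserve): false (the default) drops relative URLs; true keeps them.
  • relaxed(): Allows a, b, blockquote, br, caption, cite, code, col, colgroup, dd, div, dl, dt, em, h1, h2, h3, h4, h5, h6, i, img, li, ol, p, pre, q, small, span, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, u, ul. The rel=nofollow attribute is not enforced on links; add it manually if needed.
  • simpleText(): Allows only b, em, i, strong, u.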

4. Example

The example is based on Spring Boot.

pom.xml dependencies


<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <!-- jsoup -->
    <dependency>
        <groupId>org.jsoup</groupId>
        <artifactId>jsoup</artifactId>
        <version>1.13.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
    </dependency>

    <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.6</version>
    </dependency>
</dependencies>

The HtmlFilter class

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;

import java.util.List;

/**
 * HtmlFilter
 *
 * @author 撸小鱼
 * Created by lofish@foxmail.com on 2020-04-12
 */
public class HtmlFilter {

    /**
     * Whitelist used for cleaning; defaults to relaxed().
     * Allowed tags: a, b, blockquote, br, caption, cite, code, col, colgroup, dd, div, dl, dt, em, h1, h2, h3, h4, h5, h6, i, img, li, ol, p, pre, q, small, span, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, u, ul.
     * rel=nofollow is not enforced on links; add it manually if needed.
     */
    private Whitelist whiteList;

    /**
     * Output settings; controls whether the cleaned HTML is pretty-printed.
     */
    private Document.OutputSettings outputSettings;

    private HtmlFilter() {
    }

    /**
     * Static factory method.
     * @param whiteList whitelist of allowed tags; relaxed() is used when null
     * @param pretty whether to pretty-print the output
     * @return HtmlFilter
     */
    public static HtmlFilter create(Whitelist whiteList, boolean pretty) {
        HtmlFilter filter = new HtmlFilter();
        // Use the supplied whitelist; fall back to relaxed() only when none is given
        filter.whiteList = (whiteList == null) ? Whitelist.relaxed() : whiteList;
        filter.outputSettings = new Document.OutputSettings().prettyPrint(pretty);
        return filter;
    }

    /**
     * Static factory method using relaxed() and no pretty-printing.
     * @return HtmlFilter
     */
    public static HtmlFilter create() {
        return create(null, false);
    }

    /**
     * Static factory method.
     * @param whiteList whitelist of allowed tags
     * @return HtmlFilter
     */
    public static HtmlFilter create(Whitelist whiteList) {
        return create(whiteList, false);
    }

    /**
     * Static factory method.
     * @param excludeTags exempt tags, allowed in addition to relaxed()
     * @param includeTags tags to filter out, removed from relaxed()
     * @param pretty whether to pretty-print the output
     * @return HtmlFilter
     */
    public static HtmlFilter create(List<String> excludeTags, List<String> includeTags, boolean pretty) {
        HtmlFilter filter = create(null, pretty);
        // Tags to filter out: remove them from the whitelist
        if (includeTags != null && !includeTags.isEmpty()) {
            filter.whiteList.removeTags(includeTags.toArray(new String[0]));
        }
        // Exempt tags: add them to the whitelist
        if (excludeTags != null && !excludeTags.isEmpty()) {
            filter.whiteList.addTags(excludeTags.toArray(new String[0]));
        }
        return filter;
    }

    /**
     * Static factory method.
     * @param excludeTags exempt tags, allowed in addition to relaxed()
     * @param includeTags tags to filter out, removed from relaxed()
     * @return HtmlFilter
     */
    public static HtmlFilter create(List<String> excludeTags, List<String> includeTags) {
        // Pass the arguments in the order the three-argument overload expects
        return create(excludeTags, includeTags, false);
    }

    /**
     * @param content content to clean
     * @return the cleaned String
     */
    public String clean(String content) {
        return Jsoup.clean(content, "", this.whiteList, this.outputSettings);
    }

    public static void main(String[] args) {
        String text = "<a href=\"http://www.baidu.com/a\" onclick=\"alert(1);\"></a><script>alert(0);</script><b style=\"xxx\" onclick=\"<script>alert(0);</script>\">abc</>";
        System.out.println(HtmlFilter.create().clean(text));
    }
}

The XssFilter filter

import org.apache.commons.lang3.StringUtils;

import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * XssFilter
 *
 * @author 撸小鱼
 * Created by lofish@foxmail.com on 2020-04-12
 */
public class XssFilter implements Filter {

    /**
     * URLs exempt from filtering
     */
    private List<String> excludeUrls = new ArrayList<>();

    /**
     * Exempt tags
     */
    private List<String> excludeTags = new ArrayList<>();

    /**
     * Tags to filter out
     */
    private List<String> includeTags = new ArrayList<>();

    /**
     * On/off switch
     */
    public boolean enabled = false;

    /**
     * Character encoding
     */
    private String encoding = "UTF-8";

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        String enabledStr = filterConfig.getInitParameter("enabled");
        // Exempt URLs are read from the "urlPatterns" init parameter as
        // comma-separated regular expressions
        String excludeUrlStr = filterConfig.getInitParameter("urlPatterns");
        String excludeTagStr = filterConfig.getInitParameter("excludes");
        String includeTagStr = filterConfig.getInitParameter("includes");
        String encodingStr = filterConfig.getInitParameter("encoding");

        if (StringUtils.isNotEmpty(excludeUrlStr)) {
            String[] urls = excludeUrlStr.split(",");
            Collections.addAll(this.excludeUrls, urls);
        }

        if (StringUtils.isNotEmpty(excludeTagStr)) {
            String[] tags = excludeTagStr.split(",");
            Collections.addAll(this.excludeTags, tags);
        }

        if (StringUtils.isNotEmpty(includeTagStr)) {
            String[] tags = includeTagStr.split(",");
            Collections.addAll(this.includeTags, tags);
        }

        if (StringUtils.isNotEmpty(enabledStr)) {
            this.enabled = Boolean.parseBoolean(enabledStr);
        }

        if (StringUtils.isNotEmpty(encodingStr)) {
            this.encoding = encodingStr;
        }
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        HttpServletResponse resp = (HttpServletResponse) response;
        // Pass the request through untouched when filtering is off or the URL is exempt
        if (handleExcludeUrls(req, resp)) {
            chain.doFilter(request, response);
            return;
        }

        // Otherwise wrap the request so parameters, headers and JSON bodies are cleaned
        XssHttpServletRequestWrapper xssRequest = new XssHttpServletRequestWrapper(req, encoding, excludeTags, includeTags);
        chain.doFilter(xssRequest, response);
    }

    /**
     * @return true when the request should skip XSS filtering
     */
    private boolean handleExcludeUrls(HttpServletRequest request, HttpServletResponse response) {
        if (!enabled) {
            return true;
        }
        if (excludeUrls == null || excludeUrls.isEmpty()) {
            return false;
        }
        String url = request.getServletPath();
        for (String pattern : excludeUrls) {
            // Each entry is a regular expression anchored at the start of the path
            Pattern p = Pattern.compile("^" + pattern);
            Matcher m = p.matcher(url);
            if (m.find()) {
                return true;
            }
        }
        return false;
    }
}
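
The exclusion check in handleExcludeUrls compiles each configured entry as a regular expression anchored at the start of the servlet path. The sketch below reproduces only that matching rule with JDK classes so its behavior can be tried in isolation; the class name ExcludeUrlDemo and the sample paths are hypothetical and not part of the article's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ExcludeUrlDemo {

    /**
     * Returns true when the path matches any exclude entry.
     * Each entry is treated as a regex anchored at the start of the path,
     * mirroring the loop in handleExcludeUrls.
     */
    static boolean isExcluded(String servletPath, List<String> excludePatterns) {
        for (String pattern : excludePatterns) {
            if (Pattern.compile("^" + pattern).matcher(servletPath).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical exclude entries: a path prefix and an exact path
        List<String> excludes = Arrays.asList("/static/", "/health$");
        System.out.println(isExcluded("/static/js/app.js", excludes));
        System.out.println(isExcluded("/healthcheck", excludes));
        System.out.println(isExcluded("/api/comment", excludes));
        // A servlet-style pattern is not a regex: "^/*" matches every path
        System.out.println(isExcluded("/api/comment", Arrays.asList("/*")));
    }
}
```

Because the entries are regular expressions rather than servlet URL patterns, a value such as /* matches every path (zero or more slashes at the start), so exempt URLs are better written as plain prefixes like /static/.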

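In most cases, values are passed as request parameters.
In some scenarios, however, they arrive in the request body (JSON and the like). What then?
In that case the request's inputStream has to be filtered as well.

One thing to watch out for:
a servlet inputStream can be read only once. After the XSS filter has consumed the stream, any later logic that reads the inputStream will throw an exception or find it empty, so the content that was read must be put back into the request.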


import org.apache.commons.io.IOUtils;
import org.apache.commons.lang3.StringUtils;

import javax.servlet.ReadListener;
import javax.servlet.ServletInputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Request wrapper that applies XSS filtering
 * @author 撸小鱼
 * Created by lofish@foxmail.com
 */
public class XssHttpServletRequestWrapper extends HttpServletRequestWrapper {

    HttpServletRequest orgRequest;

    String encoding;

    HtmlFilter htmlFilter;

    private final static String JSON_CONTENT_TYPE = "application/json";

    private final static String CONTENT_TYPE = "Content-Type";

    /**
     * @param request HttpServletRequest
     * @param encoding character encoding
     * @param excludeTags exempt tags
     * @param includeTags tags to filter out
     */
    public XssHttpServletRequestWrapper(HttpServletRequest request, String encoding, List<String> excludeTags, List<String> includeTags) {
        super(request);
        orgRequest = request;
        this.encoding = encoding;
        this.htmlFilter = HtmlFilter.create(excludeTags, includeTags);
    }

    /**
     * @param request HttpServletRequest
     * @param encoding character encoding
     */
    public XssHttpServletRequestWrapper(HttpServletRequest request, String encoding) {
        this(request, encoding, null, null);
    }

    private String xssFilter(String input) {
        return htmlFilter.clean(input);
    }

    @Override
    public ServletInputStream getInputStream() throws IOException {
        // Non-JSON bodies are passed through; the header may carry a charset
        // suffix such as "application/json;charset=UTF-8", so compare by prefix
        if (!StringUtils.startsWithIgnoreCase(super.getHeader(CONTENT_TYPE), JSON_CONTENT_TYPE)) {
            return super.getInputStream();
        }
        InputStream in = super.getInputStream();
        String body = IOUtils.toString(in, encoding);
        IOUtils.closeQuietly(in);

        // Clean non-blank bodies
        if (StringUtils.isNotBlank(body)) {
            body = xssFilter(body);
        }
        // The original stream is already consumed, so always hand back a fresh one
        return new RequestCachingInputStream(body.getBytes(encoding));
    }

    @Override
    public String getParameter(String name) {
        String value = super.getParameter(xssFilter(name));
        if (StringUtils.isNotBlank(value)) {
            value = xssFilter(value);
        }
        return value;
    }

    @Override
    public String[] getParameterValues(String name) {
        String[] parameters = super.getParameterValues(name);
        if (parameters == null || parameters.length == 0) {
            return null;
        }

        // Clean into a copy so the array owned by the container is left untouched
        String[] cleaned = new String[parameters.length];
        for (int i = 0; i < parameters.length; i++) {
            cleaned[i] = xssFilter(parameters[i]);
        }
        return cleaned;
    }

    @Override
    public Map<String, String[]> getParameterMap() {
        Map<String, String[]> map = new LinkedHashMap<>();
        Map<String, String[]> parameters = super.getParameterMap();
        for (Map.Entry<String, String[]> entry : parameters.entrySet()) {
            // Work on a copy of each value array
            String[] values = entry.getValue().clone();
            for (int i = 0; i < values.length; i++) {
                values[i] = xssFilter(values[i]);
            }
            map.put(entry.getKey(), values);
        }
        return map;
    }

    @Override
    public String getHeader(String name) {
        String value = super.getHeader(xssFilter(name));
        if (StringUtils.isNotBlank(value)) {
            value = xssFilter(value);
        }
        return value;
    }

    /**
     * Returns the original, unwrapped request.
     */
    public HttpServletRequest getOrgRequest() {
        return orgRequest;
    }

    /**
     * Returns the original, unwrapped request.
     * @param request HttpServletRequest
     */
    public static HttpServletRequest getOrgRequest(HttpServletRequest request) {
        if (request instanceof XssHttpServletRequestWrapper) {
            return ((XssHttpServletRequestWrapper) request).getOrgRequest();
        }
        return request;
    }

    /**
     * <pre>
     * A servlet inputStream can be read only once.
     * After the body has been XSS-filtered, this class puts it back
     * into a ServletInputStream so later code can read it.
     * </pre>
     */
    private static class RequestCachingInputStream extends ServletInputStream {
        private final ByteArrayInputStream inputStream;

        public RequestCachingInputStream(byte[] bytes) {
            inputStream = new ByteArrayInputStream(bytes);
        }

        @Override
        public int read() throws IOException {
            return inputStream.read();
        }

        @Override
        public boolean isFinished() {
            return inputStream.available() == 0;
        }

        @Override
        public boolean isReady() {
            return true;
        }

        @Override
        public void setReadListener(ReadListener readListener) {
            // Non-blocking reads are not supported; the body is already in memory
        }
    }
}
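
The buffering idea behind RequestCachingInputStream can be tried without a servlet container. The sketch below is only an illustration using JDK classes: it drains a stream once, shows that a second read yields nothing, and then replays the saved bytes through a fresh stream. The class StreamReplayDemo and its methods drain and replay are hypothetical names, not part of the article's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamReplayDemo {

    /** Reads every remaining byte of the stream into memory. */
    static byte[] drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Wraps previously saved bytes in a new stream that can be read from the start. */
    static InputStream replay(byte[] saved) {
        return new ByteArrayInputStream(saved);
    }

    public static void main(String[] args) {
        InputStream body = new ByteArrayInputStream(
                "{\"a\":1}".getBytes(StandardCharsets.UTF_8));

        byte[] first = drain(body);   // consumes the stream
        byte[] second = drain(body);  // nothing is left to read

        System.out.println("first read:  " + first.length + " bytes");
        System.out.println("second read: " + second.length + " bytes");

        // Later readers get the same content from a fresh stream over the saved bytes
        String replayed = new String(drain(replay(first)), StandardCharsets.UTF_8);
        System.out.println("replayed:    " + replayed);
    }
}
```

The wrapper applies the same pattern, with the difference that the saved bytes are the cleaned body rather than the raw one.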

Registering the filter in Spring Boot 2.2.4.RELEASE

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import javax.servlet.DispatcherType;
import java.util.HashMap;
import java.util.Map;

@Configuration
public class XssFilterConfig {

    @Value("${xss.enabled:true}")
    private String enabled;

    @Value("${xss.excludes:}")
    private String excludes;

    // Fixed placeholder: the key is xss.includes, with an empty default
    @Value("${xss.includes:}")
    private String includes;

    @Value("${xss.urlPatterns:/*}")
    private String urlPatterns;

    @Bean
    public FilterRegistrationBean<XssFilter> xssFilterRegistrationBean() {
        FilterRegistrationBean<XssFilter> registration = new FilterRegistrationBean<>();
        registration.setDispatcherTypes(DispatcherType.REQUEST);
        registration.setFilter(new XssFilter());
        registration.addUrlPatterns(urlPatterns.split(","));
        registration.setName("XssFilter");
        registration.setOrder(Integer.MAX_VALUE);
        Map<String, String> initParameters = new HashMap<>();
        initParameters.put("excludes", excludes);
        // Pass the includes value here, not excludes
        initParameters.put("includes", includes);
        initParameters.put("enabled", enabled);
        registration.setInitParameters(initParameters);
        return registration;
    }
}
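
The registration class reads its settings through @Value placeholders, so they can be supplied in application.properties. The fragment below is a sketch with illustrative values only; the tag names section and article are assumptions chosen as an example of extra tags to allow, and the other two lines simply restate the defaults declared in the class.

```properties
# Turn the XSS filter on or off (default: true)
xss.enabled=true

# Servlet URL patterns the filter is registered for, comma-separated (default: /*)
xss.urlPatterns=/*

# Extra tags to allow on top of the relaxed() whitelist, comma-separated
# (illustrative values, not from the original article)
xss.excludes=section,article
```

Values are split on commas without trimming, so the list should not contain spaces.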

Testing

http://localhost:8080/demo/th/xss?abc=%3Ca%20href=%22http://www.baidu.com/a%22%20onclick=%22alert(1);%22%3Eabc%3C/a%3E%3Cscript%3Ealert(0);%3C/script%3E&abc=%3Cb%20style=%22xxx%22%20onclick=%22%3Cscript%3Ealert(0);%3C/script%3E%22%3Eabc%3C/%3E
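
To see what the test request actually sends, the query string can be percent-decoded with the JDK's URLDecoder. This is only a sketch for inspecting the payload; the class name DecodeTestQuery and the helper valuesOf are hypothetical and not part of the article's code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

public class DecodeTestQuery {

    /** Query string of the test URL above (everything after the '?'). */
    static final String TEST_QUERY =
            "abc=%3Ca%20href=%22http://www.baidu.com/a%22%20onclick=%22alert(1);%22%3Eabc%3C/a%3E%3Cscript%3Ealert(0);%3C/script%3E"
            + "&abc=%3Cb%20style=%22xxx%22%20onclick=%22%3Cscript%3Ealert(0);%3C/script%3E%22%3Eabc%3C/%3E";

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM
            throw new IllegalStateException(e);
        }
    }

    /** Returns the decoded values of every occurrence of the named parameter. */
    static List<String> valuesOf(String query, String name) {
        List<String> values = new ArrayList<>();
        for (String pair : query.split("&")) {
            // Split on the first '=' only: the values themselves contain '='
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;
            }
            if (decode(pair.substring(0, eq)).equals(name)) {
                values.add(decode(pair.substring(eq + 1)));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        for (String value : valuesOf(TEST_QUERY, "abc")) {
            System.out.println(value);
        }
    }
}
```

The two abc values decode to the same kind of markup used in HtmlFilter.main: an a tag with an onclick handler followed by a script element, and a b tag with style and onclick attributes. With the filter enabled, the controller should receive these values with the script element and the event-handler attributes stripped.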

Source: https://segmentfault.com/a/1190000022348583
