I am using CSSBox to extract the sizes and positions of HTML elements on websites. For that purpose I am using the BoxFactory class, but it is very slow: it takes around 2-3 seconds to generate the Viewport for pages that do not have much content. For example:
https://www.pracuj.pl/praca/senior-data-specialist-krakow,oferta,6303360 - takes around 4-5 seconds to generate Viewport.
String url = "https://www.pracuj.pl/praca/senior-data-specialist-krakow,oferta,6303360";
Instant start = Instant.now();
DocumentSource docSource = new DefaultDocumentSource(url);
DOMSource parser = new DefaultDOMSource(docSource);
Document doc = parser.parse();
Instant finish = Instant.now();
System.out.println("Parser " + (Duration.between(start, finish).toMillis()));
start = Instant.now();
DOMAnalyzer da = new DOMAnalyzer(doc, docSource.getURL());
da.attributesToStyles();
da.addStyleSheet((URL) null, CSSNorm.stdStyleSheet(), DOMAnalyzer.Origin.AGENT);
da.addStyleSheet((URL) null, CSSNorm.userStyleSheet(), DOMAnalyzer.Origin.AGENT);
da.getStyleSheets();
finish = Instant.now();
System.out.println("Styles " + (Duration.between(start, finish).toMillis()));
start = Instant.now();
BoxFactory boxFactory = new BoxFactory(da, docSource.getURL());
BrowserConfig config = new BrowserConfig();
config.setLoadBackgroundImages(false);
config.setLoadImages(false);
boxFactory.setConfig(config);
boxFactory.reset();
finish = Instant.now();
System.out.println("Factory init " + (Duration.between(start, finish).toMillis()));
VisualContext ctx = new VisualContext(null, boxFactory);
start = Instant.now();
Viewport viewport = boxFactory.createViewportTree(da.getRoot(), new BufferedImage(1000, 600, BufferedImage.TYPE_INT_RGB).createGraphics(), ctx, 1000, 600);
finish = Instant.now();
System.out.println("Factory run " + (Duration.between(start, finish).toMillis()));
viewport.initSubtree();
viewport.doLayout(600, true, true);
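The per-phase timing above could also be factored into a small helper so that every phase is measured the same way. This is only a minimal sketch: `PhaseTimer`, `elapsedMillis` and `time` are hypothetical names of my own, not part of CSSBox, and the workload in `main` is a stand-in for the real steps (parsing, style analysis, `createViewportTree`).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of CSSBox): times one phase and prints the result.
final class PhaseTimer {
    // Elapsed milliseconds from start to end. The earlier instant goes first;
    // with the arguments swapped, Duration.between returns a negative value.
    static long elapsedMillis(Instant start, Instant end) {
        return Duration.between(start, end).toMillis();
    }

    // Runs the phase, prints "<label> <ms>", and returns the phase's result.
    // Callable is used so that phases throwing checked exceptions fit as well.
    static <T> T time(String label, Callable<T> phase) throws Exception {
        Instant start = Instant.now();
        T result = phase.call();
        System.out.println(label + " " + elapsedMillis(start, Instant.now()));
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload; in the code above this would be e.g. parser.parse().
        Integer sum = time("Sum", () -> {
            int s = 0;
            for (int i = 1; i <= 1000; i++) {
                s += i;
            }
            return s;
        });
        System.out.println(sum);
    }
}
```

With such a helper, a phase from the snippet above would be wrapped as, for example, `Document doc = PhaseTimer.time("Parser", () -> parser.parse());`.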
Thanks.