The goal is to generate the skeleton of a website through human-directed machine learning
- Generate a website in 10 steps
- Let the human choose which candidate they prefer
- Add a GAN with OpenCV to detect which generated websites move towards the one the human prefers.
- Done
There are numerous challenges
- How do we create dynamically updating HTML? Is this done by rendering a template? Or is a GET request sent after each human intervention?
There are many ways of treating this problem, which is what makes it fun
- What sort of input? Is it a language model only looking at the code?
- Is it a reinforcement learning solution?
- Can we use a GAN to generate tasks? If so, does this mean that we need a dataset of images of websites to train a CNN (Discriminator) on?
- Is it possible to use minimal human feedback to generate something that's in the 'area' of a website?
- Is it possible to do this purely using code, without any form of image recognition? That is, using HTML/XML to generate more HTML/XML, and having it judged by a human who looks at the webpage? Or do we need to (at some point) insert an image recognition system, either using OpenCV or a pre-trained CNN?
There are other projects that similarly try to generate websites, but most of them have not produced a complete solution, or at least not an open-source one. That's why I think this is something interesting to work on, and I would value your input.
We can create a human-in-the-loop interaction by using partial feedback: the code the human prefers is downsampled, and upsampled code is then regenerated from the downsampled version. This way, we can keep regenerating and upsampling code.
Scrape 10,000 websites and build a dataset that pairs each site with its downsampled version.
Train a seq2seq generator to generate upsampled HTML from downsampled HTML.
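As a rough illustration of how the (downsampled, original) training pairs could be built, here is a minimal sketch using only the Python standard library. The names `SkeletonBuilder`, `downsample`, `make_pairs` and the `drop_prob` parameter are hypothetical, and the "formula" shown (strip attributes and text, then randomly drop subtrees) is just one assumed interpretation. It also assumes reasonably well-formed HTML.

```python
import random
from html.parser import HTMLParser

# Void elements never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SkeletonBuilder(HTMLParser):
    """Keeps only tag structure; attributes and text are discarded."""

    def __init__(self, drop_prob, rng):
        super().__init__(convert_charrefs=True)
        self.drop_prob = drop_prob
        self.rng = rng
        self.out = []
        self.skip_depth = 0  # > 0 while inside a randomly dropped subtree

    def handle_starttag(self, tag, attrs):
        if self.skip_depth > 0:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        # The first tag seen is always kept so the skeleton is never empty.
        drop = bool(self.out) and self.rng.random() < self.drop_prob
        if tag in VOID_TAGS:
            if not drop:
                self.out.append(f"<{tag}>")
            return
        if drop:
            self.skip_depth = 1
            return
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth > 0:
            self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")


def downsample(html_text, drop_prob=0.3, seed=None):
    """Reduce a page to a bare tag skeleton, randomly dropping subtrees."""
    builder = SkeletonBuilder(drop_prob, random.Random(seed))
    builder.feed(html_text)
    builder.close()
    return "".join(builder.out)


def make_pairs(pages, drop_prob=0.3, seed=0):
    """Return (downsampled, original) pairs for seq2seq training."""
    return [(downsample(page, drop_prob, seed + i), page)
            for i, page in enumerate(pages)]
```

Seeding the random generator per page keeps the dataset reproducible while still varying which parts of each page are removed.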
A downsampler will convert all HTML into a basic HTML skeleton using a predefined and randomized formula.
A seq2seq generator trained on upsampled HTML will then re-upsample this.
The human is given two generated samples to select one from.
The sample chosen by the human is downsampled and fed back into the NN.
When the human is happy, they select 'Done' and they finish the product.
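The interaction described above can be sketched as a simple loop. This is only an outline under stated assumptions: `refine` is a hypothetical name, and `generate`, `downsample` and `choose` are placeholders for the trained seq2seq generator, the downsampler, and whatever interface shows the two samples to the human. The `max_rounds` cap is also an assumption.

```python
from typing import Callable, List, Tuple


def refine(seed_skeleton: str,
           generate: Callable[[str], str],
           downsample: Callable[[str], str],
           choose: Callable[[List[str]], Tuple[int, bool]],
           max_rounds: int = 10) -> str:
    """Repeat generate -> human choice -> downsample until 'Done'.

    choose() receives the two candidates and returns (index, done):
    the index of the preferred sample and whether the human is happy.
    """
    skeleton = seed_skeleton
    chosen = seed_skeleton
    for _ in range(max_rounds):
        # Two samples from the same skeleton; the generator is assumed
        # to be stochastic, so they will normally differ.
        candidates = [generate(skeleton), generate(skeleton)]
        index, done = choose(candidates)
        chosen = candidates[index]
        if done:
            break
        # Feed the preferred sample back in as a new skeleton.
        skeleton = downsample(chosen)
    return chosen
```

Keeping the three components as plain callables means the loop can be tried out with stubs before any model is trained.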
Scrape the code of 10,000 websites and train a GAN to generate similar code.
Scrape the websites and process the data into a usable form.
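What "usable" means is still open, but one plausible preprocessing step is to strip everything that is not markup structure or visible text. The sketch below uses only the standard library; `Cleaner`, `clean_html` and the `DROP_CONTENT` set are hypothetical names, and the choice to drop scripts, styles, comments and the doctype is an assumption rather than a settled design.

```python
import html
from html.parser import HTMLParser

# Elements whose whole content is discarded (an assumed choice).
DROP_CONTENT = {"script", "style", "noscript"}


class Cleaner(HTMLParser):
    """Re-emits HTML without scripts, styles, comments or extra whitespace."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skipping = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skipping += 1
            return
        if self.skipping:
            return
        rendered = "".join(
            f" {name}" if value is None
            else f' {name}="{html.escape(value, quote=True)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit the opening tag only.
        if tag not in DROP_CONTENT:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skipping = max(0, self.skipping - 1)
            return
        if not self.skipping:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skipping:
            return
        text = " ".join(data.split())  # collapse runs of whitespace
        if text:
            self.out.append(html.escape(text, quote=False))


def clean_html(raw):
    """Return a normalized copy of one scraped page."""
    cleaner = Cleaner()
    cleaner.feed(raw)
    cleaner.close()
    return "".join(cleaner.out)
```

Comments and the doctype disappear because the parser's default handlers for them do nothing. Note that whitespace is collapsed everywhere, including inside `<pre>`, which may or may not be acceptable for the final dataset.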