I finally got my script to work, but the search (run via AJAX) takes a long time. Given a keyword, it searches the site and captures the title, URL, and thumbnail of every video found. The problem came when I also needed each video's tags: they only appear on the video's own page, so I have to request every video page to read them. The only approach I could think of was a loop nested inside the loop that processes the found videos, that is:
For each video found -> capture title, thumbnail, URL -> go to the captured URL and capture its tags.
The code I use is basically the following. I would like to know whether there is a way to speed up the searches, either by optimizing this code or by taking a different approach:
My parse function:
    <?php
    // Needs the simple_html_dom library; adjust the path to your copy.
    require_once 'simple_html_dom.php';

    // Downloads $href with cURL and returns it as a simple_html_dom object.
    function dlPage($href) {
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
        // Request headers go in CURLOPT_HTTPHEADER as an array;
        // CURLOPT_HEADER only toggles response headers in the output.
        curl_setopt($curl, CURLOPT_HTTPHEADER, array("Accept-Language: en-US"));
        curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($curl, CURLOPT_URL, $href);
        curl_setopt($curl, CURLOPT_REFERER, $href);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
        curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4");
        $str = curl_exec($curl);
        curl_close($curl);

        // Create a DOM object
        $dom = new simple_html_dom();
        // Load HTML from a string
        $dom->load($str);
        return $dom;
    }
    ?>
My script:
    $buscartag = str_replace(' ', '+', $_POST['buscartag']);
    $urlparse = "https://example.com/?k=" . $buscartag;
    $paginas = rand(0, 50);
    $html = dlPage($urlparse . "&p=" . $paginas);
    $counter = 0;

    foreach ($html->find('div.video-box') as $videos) {
        if ($videos) {
            $titulo = $videos->find('div.video-box>p[!class]>a[!class]', 0)->attr['title'];
            $pathvideo = str_replace('_', '', $videos->attr['id']);
            $link = "https://www.example.com/" . $pathvideo . "/";
            $thumb = $videos->find('div.thumb', 0)->innertext;

            // HERE IS MY SECOND LOOP FOR THE TAGS!!!
            // One extra HTTP request per video: this is the slow part.
            $gettags2 = array();
            $html_tags = file_get_html($link);
            foreach ($html_tags->find('a.nu') as $gettags) {
                $gettags2[] = $gettags->innertext;
            }

            // $idvideo and $urlimagen are not defined in this excerpt.
            if (!empty($titulo) && !empty($link) && !empty($idvideo) && !empty($urlimagen)) {
                $counter++;
                // here will echo all variables
            }
        }
    }
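Since the slow part is the one request per video made inside the loop, one direction that might help is downloading all the video pages concurrently instead of one after another. The following is only a minimal sketch of that idea using PHP's `curl_multi_*` functions. The helper name `fetchAll` and its `$timeout` parameter are hypothetical and not part of my script; it assumes the links have been collected into an array first.

```php
<?php
// Hypothetical helper (not part of the original script): downloads several
// URLs concurrently and returns the response bodies keyed by URL.
function fetchAll(array $urls, int $timeout = 15): array
{
    $multi = curl_multi_init();
    $handles = [];

    // Create one easy handle per URL and register it with the multi handle.
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none of them is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running) {
            // Wait for network activity instead of busy-looping.
            curl_multi_select($multi, 1.0);
        }
    } while ($running && $status === CURLM_OK);

    // Collect the bodies; a transfer with no content is stored as ''.
    $bodies = [];
    foreach ($handles as $url => $ch) {
        $bodies[$url] = (string) curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $bodies;
}
```

With something like this, the first loop would only collect `$link` values, `fetchAll($links)` would download them together, and each returned body could be parsed with simple_html_dom's `str_get_html()` in place of the `file_get_html($link)` call. Whether this actually helps depends on how many parallel requests the remote site tolerates.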