PostgreSQL: Join 4 tables with group by, 2 having and where clause

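My database consists of 4 tables:

  • users(id, "name", surname, birthdate)
  • friendships(userid1, userid2, "timestamp")
  • posts(id, userid, "text", "timestamp")
  • likes(postid, userid, timestamp)

I need a result set of unique user names that have more than 3 friendships within January 2018 and whose average number of likes per post is in the range [10; 35].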

As a first step I wrote this statement:

    SELECT  DISTINCT u."name"
    FROM users u
    JOIN friendships f ON u.id = f.userid1
    WHERE f."timestamp" BETWEEN '2018-01-01'::TIMESTAMP AND '2018-01-31'::TIMESTAMP
    GROUP BY u.id
    HAVING COUNT(f.userid1) > 3;

It works fine and returns 3 rows. But when I add the second part like this:

    SELECT  DISTINCT u."name"
    FROM users u
    JOIN friendships f ON u.id = f.userid1
    JOIN posts p ON p.userid = u.id
    JOIN likes l ON p.id = l.postid
    WHERE f."timestamp" BETWEEN '2018-01-01'::TIMESTAMP AND '2018-01-31'::TIMESTAMP
    GROUP BY u.id
    HAVING COUNT(f.userid1) > 3
        AND ((COUNT(l.postid) / COUNT(DISTINCT l.postid)) >= 10
            AND (COUNT(l.postid) / COUNT(DISTINCT l.postid)) < 35);

I get 94 rows and I have no idea why. I would be grateful for any help.


You don't need DISTINCT on u."name", because grouping by u."name" already removes the duplicates.

    SELECT
       u."name"
    FROM
       users u
       INNER JOIN friendships f ON u.id = f.userid1
       INNER JOIN posts p ON u.id = p.userid
       INNER JOIN likes l ON p.id = l.postid
    WHERE
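       f."timestamp" >= '2018-01-01'::TIMESTAMP
       AND f."timestamp" < '2018-02-01'::TIMESTAMP
    GROUP BY
        u."name"
    HAVING
        -- count distinct friends: f.userid1 is the user's own id here
        COUNT(DISTINCT f.userid2) > 3
        -- caution: COUNT(l.postid) counts joined rows, so each like is
        -- repeated once per friendship row of the user
        AND ((COUNT(l.postid) / COUNT(DISTINCT l.postid)) >= 10
                AND (COUNT(l.postid) / COUNT(DISTINCT l.postid)) < 35);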

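As mentioned in the comments, BETWEEN is not a good choice for a date range. The upper bound '2018-01-31'::TIMESTAMP means midnight at the start of 31 January, so everything later that day is excluded. A half-open range avoids this: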

    f."timestamp">= '2018-01-01'::TIMESTAMP
    AND f."timestamp" < '2018-02-01'::TIMESTAMP

That gives you the whole of January.
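The difference between the two forms can be checked on a few hand-picked timestamps. The following is a small sketch that runs without any tables; the sample values and the column aliases in_between and in_half_open are invented for illustration and are not part of the original schema.

```sql
-- Compare the BETWEEN filter from the question with the half-open range.
-- The four sample timestamps are assumptions chosen to hit the edge cases.
SELECT ts,
       ts BETWEEN '2018-01-01'::timestamp AND '2018-01-31'::timestamp AS in_between,
       ts >= '2018-01-01'::timestamp AND ts < '2018-02-01'::timestamp AS in_half_open
FROM (VALUES
        ('2018-01-01 00:00:00'::timestamp),  -- in_between = t, in_half_open = t
        ('2018-01-31 00:00:00'::timestamp),  -- in_between = t, in_half_open = t
        ('2018-01-31 15:30:00'::timestamp),  -- in_between = f, in_half_open = t
        ('2018-02-01 00:00:00'::timestamp)   -- in_between = f, in_half_open = f
     ) AS sample(ts);
```

The third row is the one that matters: a friendship created on the afternoon of 31 January is dropped by BETWEEN but kept by the half-open range.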


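Try the query below! The problem with COUNT(f.userid1) > 3 is that if a user has 2 friends, 6 posts and 3 likes, the join produces 2 x 6 = 12 rows (or more, if a post has several likes), so there are 12 records with a non-null f.userid1. Counting DISTINCT f.userid2 counts the distinct friends instead. The other counts used for filtering suffer from the same problem.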
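Before the rewritten query, the row multiplication can be reproduced on a tiny data set. This is only an illustrative sketch: the CTEs reuse the table names from the question but contain invented rows (one user with 2 friends, 3 posts and 3 likes), and the output column aliases are made up for the example.

```sql
-- Invented sample data; the CTEs shadow the real tables for this statement only.
WITH users(id, "name") AS (VALUES (1, 'ann')),
     friendships(userid1, userid2) AS (VALUES (1, 2), (1, 3)),
     posts(id, userid) AS (VALUES (10, 1), (11, 1), (12, 1)),
     likes(postid, userid) AS (VALUES (10, 2), (10, 3), (11, 2))
SELECT COUNT(*)                  AS joined_rows,         -- 8
       COUNT(f.userid1)          AS naive_friend_count,  -- 8, although there are 2 friends
       COUNT(DISTINCT f.userid2) AS distinct_friends,    -- 2
       COUNT(l.postid)           AS naive_like_count,    -- 6, although there are 3 likes
       COUNT(DISTINCT p.id)      AS distinct_posts       -- 3
FROM users u
JOIN friendships f ON u.id = f.userid1
JOIN posts p ON p.userid = u.id
LEFT JOIN likes l ON p.id = l.postid;
```

The 3 posts expand to 4 post/like rows (one post has two likes, one has none), and each of those is repeated for both friendship rows, which gives 8 joined rows. Every plain COUNT is inflated by that factor, while the DISTINCT counts are not.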

    SELECT  u."name"
    FROM users u
    JOIN friendships f ON u.id = f.userid1
    JOIN posts p ON p.userid = u.id
    LEFT JOIN likes l ON p.id = l.postid
    WHERE f."timestamp" >= '2018-01-01'::TIMESTAMP AND f."timestamp" < '2018-02-01'::TIMESTAMP
    GROUP BY u.id, u."name"
    HAVING
      -- more than three distinct friends
      COUNT(DISTINCT f.userid2) > 3
      -- distinct likes / distinct posts
      -- we use l.* to count distinct likes since there's no primary key
      AND (COUNT(DISTINCT l.*) / COUNT(DISTINCT p.id)) >= 10
      AND (COUNT(DISTINCT l.*) / COUNT(DISTINCT p.id)) < 35;
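An alternative that avoids the row multiplication altogether is to aggregate friendships and likes in separate subqueries and only then join the one-row-per-user results. The following is a sketch under several assumptions, not a tested replacement: the date filter applies only to friendships (as in the queries above), the upper bound of the average is treated as exclusive (as in the queries above), likes contains at most one row per postid/userid pair, and the aliases fr, lk, pl, friend_count, like_count and avg_likes are names invented for this example.

```sql
SELECT u."name"
FROM users u
JOIN (
    -- friendships created in January 2018, one row per user
    SELECT f.userid1, COUNT(DISTINCT f.userid2) AS friend_count
    FROM friendships f
    WHERE f."timestamp" >= '2018-01-01'::timestamp
      AND f."timestamp" <  '2018-02-01'::timestamp
    GROUP BY f.userid1
) fr ON fr.userid1 = u.id
JOIN (
    -- likes per post first, then the average over each user's posts;
    -- posts without likes count as 0 because of the LEFT JOIN
    SELECT pl.userid, AVG(pl.like_count) AS avg_likes
    FROM (
        SELECT p.id, p.userid, COUNT(l.postid) AS like_count
        FROM posts p
        LEFT JOIN likes l ON l.postid = p.id
        GROUP BY p.id, p.userid
    ) pl
    GROUP BY pl.userid
) lk ON lk.userid = u.id
WHERE fr.friend_count > 3
  AND lk.avg_likes >= 10
  AND lk.avg_likes < 35;
```

Because each subquery returns at most one row per user, the friend count and the like average cannot inflate each other, and no DISTINCT is needed in the outer query unless two different users share the same name.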